PROPERTIES OF LEARNING CURVES UNDER VARIED
DISTRIBUTIONS OF PRACTICE* 


BY MARY J. KIENTZLE 
The Procter and Gamble Company 


Since the beginning of scientific psychology, there have been 
systematic studies of spaced practice. In many early investigations, 
emphasis lay on finding the optimal distribution of effort. Most 
of the recent experiments, however, have been directed toward the 
refinement of learning theories. 

The present study regards spacing of practice as one of many 
conditions which affect score in a task. Thus it represents a phase 
of the general problem of predicting changes in score from known 
changes in the conditions which determine score. Any score may 
be expressed as a function of its determining conditions by the follow- 
ing equation: 


$$y = f(x_1, x_2, x_3, \ldots, x_n) \tag{1}$$


in which y, the score, is a measure of some characteristic of perform- 
ance, and in which the x’s are measures of attributes of determining 
conditions. Thus, y might be a measure of accuracy, of speed, or of 
any other quality of performance; the x’s might be measures of age 
of the S, intensity of illumination, number of trials, or of any other 
qualities of determining conditions. The function f is not defined 
here; it is used only to indicate that score depends on, or, is a function 
of, its determiners. The underlying problem, in terms of which the 
present investigation was designed, is to find the exact expression 
for equation (1), so that f may be a specifically formulated law in- 
stead of a mere indication of relationship. 


* An abstract of a thesis submitted in partial fulfillment of the requirements for the degree 
of Doctor of Philosophy in Psychology in the Graduate School of the University of Illinois, 1945. 

The writer wishes to express her appreciation of the aid of Professor Herbert Woodrow, 
under whose direction this research was conducted. 
For convenience, the x’s in equation (1) are usually considered in
two large groups, namely: experimental variables and subject variables.
Equation (1), rewritten to indicate this grouping, becomes


$$y = f(E, S), \tag{2}$$


in which E represents the totality of experimental variables and S,
the totality of subject variables. ‘Experimental variables’ are those 
which may be changed arbitrarily so that their effect on y can be 
charted at successive points; ‘subject variables’ are those which 
account for differences among individuals’ scores when experimental 
conditions are held constant. To study the relationship between 
score and experimental variables, one employs direct experimental 
methods and uses an average score from many subjects. Thus he 
treats variation among individuals under fixed conditions as error of 
observation. On the other hand, when one studies the relationship 
between score and subject variables, he tries to explain variation 
among individuals. Statistical methods are used to isolate the 
sources of individual differences. The classification of determiners 
into experimental and subject variables is made as a convenience to 
the experimenter and not as a logical dichotomy. From one stand- 
point a determining condition may be an experimental variable 
while from another it may be a subject variable. For example, score 
depends on amount of practice. Part of the practice may be given 
under experimental conditions and its effects observed. And yet, 
there may exist wide differences among Ss in amount of pre-experi- 
mental practice or its equivalent. Although amount of practice is 
logically the same determiner, whether it occurs during the experi- 
ment or pre-experimentally, convenience and existing quantitative 
methods make it desirable to consider it an experimental variable in 
the first instance and a subject variable in the second. Although 
the methods of investigation are not alike for the two classes of 
variables, the fundamental problem remains the same, that is, to 
discover the relationship indicated by equation (1). 

The specific problem of this paper, that is, the relationship of 
score to number of practice trials and to amount of rest between trials, 
has been considered from two viewpoints. The first regards the 
dependence of only the mean scores upon number of trials and dura- 
tion of rest. This function was studied by fitting rectangular hyper- 
bolas to mean learning curves obtained under varied distributions of 
practice, and by comparing the parameters from the different curves. 
However, after mean score has been expressed as a function of the 
two experimental variables, there still remains the problem of ex- 
plaining variation from that mean score, or, the task of accounting 
for individual differences. The scores were therefore analyzed from a
second viewpoint, namely, that which considers them as dependent
upon characteristics of the Ss, or what have here been termed subject 
variables. To arrive at these subject variables, intercorrelations 
among scores on designated trials were factor analyzed. Further- 
more, relationships between subject variables and the two experi- 
mental conditions were determined by studying group variability as 
a function of number of trials and of duration of rest. 


PROCEDURE 


Practice curves were obtained from 12 groups of Ss. In all cases, the trials lasted for one min. 
and were separated by rest pauses varying from no rest at all to seven days’ rest. The 12 groups 
differed from one another in the duration of the interval which was regularly interposed between 
trials. Table I gives a description of the groups. 











TABLE I
DESCRIPTION OF EXPERIMENTAL GROUPS

Group    Number of Subjects    Duration of Rest      Number of Trials
I               46             0 sec.                      20
Ia              45             0 sec.                      70
II              63             3 sec.                      20
III             52             5 sec.                      20
IV              45             10 sec.                     20
V               54             15 sec.                     20
VI              56             30 sec.                     20
VII             42             45 sec.                     20
VIII            54             60 sec.                     20
VIIIa           55             60 sec. (tallies)           10
IX              46             90 sec.                     15
X               35             7 days                      12
Total          593














The number of trials per group varies because of the limited time available. Most of the 
data were collected during a 50-min. class hour. Therefore, it was impossible to obtain 20 trials 
from Group IX. Likewise, there was no opportunity to have 20 trials for Group X, which had 
a week’s rest between trials. 

The Ss were elementary psychology students, mostly sophomores, at the University of 
Illinois. Except for Group Ia, the students performed the task as part of their regular classroom 
work. Altogether, 33 sections participated. Group Ia differed in that it met outside of class 
time and in that the Ss, also elementary psychology students, were paid for their services. Ex- 
cept for Group X, all the data from any S were collected during one session. Group X had one 
trial per week, and, in the final treatment of the data, consisted only of those Ss who were present 
for all 12 trials. 

The task, which is modeled after one given as a class demonstration in Ruch and Warren’s 
Working with psychology (8), is that of printing the alphabet upside down in such a fashion that 
when the paper is turned through a 180-degree angle, the alphabet can be read from left to right 
in the usual manner. This task yields a sizable score in a relatively short time, and it is one in 
which young adults show rather rapid improvement. 

During the trials, which were one min. long, the Ss were to print with a lead pencil as rapidly 
and as accurately as possible the inverted alphabet, making one letter in each square of the four- 
to-the-inch cross-section paper provided as test blanks. As soon as they completed a line, they 
covered it with a ‘cover sheet’ in order to preclude the possibility of copying from one line to the
next. During the rest periods, the Ss were not to look at their papers and not to think or to
talk about the task. So far as E could detect, these instructions were followed. 

For all Ss except Group Ia and Group X, the test blanks consisted of three sheets of four-to- 
the-inch cross section paper, eight and one-half in. wide and eleven in. long. The top sheet was 
designated ‘cover sheet’ and, as was mentioned earlier, was used only to cover each line of printing 
as it was completed. The other two sheets were numbered along the right-hand side to indicate 
starting points for each trial. It was possible to put 10 trials on a page. Group Ia had a similar 
set of blanks, with seven sheets to provide space for 70 trials. Group X was given a single piece 
of fresh cross section paper each week and a cover sheet to go with it. 

All groups except Group VIIIa and Group X simply sat during the rest interval. They did 
not practice, nor did they talk about the experiment. However, they could, and did, talk about 
other things during the longer rest intervals. Group VIIIa spent its ‘rest pause’ making tallies 
and cross-bars, that is, four vertical lines with a horizontal line through them, in each 
little square. This ‘rest’ lasted for one min. The Ss alternated between this task and the 
inverted alphabet. Only the trials from the inverted alphabet were scored. Group I and Group 
Ia had no rest at all between trials. At the end of each minute they were told, “Go to the next 
trial.” That is, they simply worked a line or so lower on the same sheet, without any loss of 
time. Group X, which had a seven-day rest between trials, was instructed not to practice out- 
side the experimental periods. 

For all groups, the score was the number of letters written per trial, irrespective of accuracy. 
The scores seem satisfactory as to reliability; this subject will be discussed later. 


RESULTS 


Mean scores in writing the inverted alphabet under the conditions 
described are presented in Table II. This table shows, for each 
experimental group, the average number of letters written and the 
standard deviation of the distribution at every trial. It also gives 
the standard deviation at each trial as a proportion of the standard 
deviation of the first trial. 


MEAN SCORES AS FUNCTIONS OF NUMBER OF TRIALS
AND OF DURATION OF REST 


Statistical comparisons of initial mean scores and of initial stand- 
ard deviations indicated that the 12 experimental groups were not 
significantly dissimilar at the beginning of practice. Therefore, no
attempt was made to match the groups by discarding some of the 
scores. 

Calculations made from the data in Table II lead to the conclusion 
that groups with distributed practice gain more than groups with 
massed practice. Three measures of gain were computed by taking 
the differences between the means for the tenth and first trials, the 
twentieth and the tenth, and the twentieth and first trials. Fig. 1 
shows each of these measures of gain to be an increasing but negatively 
accelerated function of the duration of the rest interval. 

Despite occasional inversions in the data, the trend is for the
longer rest periods to show the greater gains. However, there is a
limit to the advantage of added rest; Group X, with seven days’ rest,
does not show a higher mean gain than that of Group VII, with 45
sec. rest. In fact, so far as the obtained mean gains are concerned, a
rest of 45 sec. produces the same amount of improvement as the
longer rest intervals. Up to a rest pause of 45 sec., longer rest
periods show progressively less increase in gain. As shown by Fig.
1, the longer rest groups are building their superiority at all parts
of the curve. Longer rests are more beneficial than the shorter ones
both during the first and during the later trials. Of course, this
difference among groups in amount gained tends to vanish as complete
learning is approached. The gain from one of the early trials
to any succeeding trial may continue to show the superiority of the
rested groups, but the gains between two trials very late in learning
are likely to approach zero and hence to be equal. Within the limits
of the curves obtained here, this effect does not appear, for all the
groups are still gaining at the end of practice.


TABLE II (Continued)
MEANS AND STANDARD DEVIATIONS OF THE DISTRIBUTIONS AT EACH TRIAL

Group* Ia

Trial      21     22     23     24     25     26     27     28     29     30
M        47.0   49.9   52.0   54.2   53.2   54.?     --     --     --     --
σ        8.59   7.44   8.43   7.46   9.35    7.?     --     --     --     --
σ/σ₁     1.05    .91   1.03    .92   1.15    .94   1.07   1.07   1.02    .96

Trial      31     32     33     34     35     36     37     38     39     40
M        53.4   55.0   55.9   55.2   57.0   55.5   55.7   56.0   56.4   57.2
σ        9.31   9.26   8.82   9.61   8.47   8.97   9.20   8.66   8.64   8.26
σ/σ₁     1.14   1.14   1.08   1.18   1.04   1.10   1.13   1.06   1.06   1.01

Trial      41     42     43     44     45     46     47     48     49     50
M        53.8   57.5   58.9   59.3   56.8   57.7   59.4   58.3   59.9   55.3
σ        8.10   8.26   8.91   7.02   8.18   9.00   9.22  10.09  10.18   8.72
σ/σ₁      .99   1.01   1.09    .86   1.00   1.10   1.13   1.24   1.25   1.07

Trial      51     52     53     54     55     56     57     58     59     60
M        55.2   59.1   58.5   60.1   60.1   59.9   60.9   58.8   60.0   59.8
σ        9.81   9.37  10.07   9.66  11.05  10.64   9.76  10.70  10.38  10.16
σ/σ₁     1.20   1.15   1.24   1.18   1.36   1.30   1.20   1.31   1.27   1.25

Trial      61     62     63     64     65     66     67     68     69     70
M        56.9   59.3   60.7   59.6   62.4   59.7   60.8   62.1   60.9   59.0
σ       10.66   7.78  11.29   9.45   9.97  10.18  10.20  10.40   9.98  10.62
σ/σ₁     1.31    .95   1.38   1.16   1.22   1.25   1.25   1.28   1.22   1.30

* Only Group Ia had more than 20 trials.
Fig. 1. Mean gains under varied distributions of practice. (Legend: filled circles, gains
between Trial 1 and Trial 20; empty circles, gains between Trial 1 and Trial 10; triangles, gains
between Trial 10 and Trial 20.)


The preceding paragraphs have given a picture of improvement 
under the different conditions of distribution. The ensuing discus- 
sion considers the means themselves instead of the gains between 
trials. The general method of studying the dependence of mean 
scores on their determining conditions was to fit the learning curves 
obtained under each condition of spacing to equilateral hyperbolas, 
and to observe changes in the parameters of those equations under 
changes in duration of rest. The equilateral hyperbola 


(x − a)(y − b) = k


was chosen because it yields two important measures of learning.
The nature of the task is such that eventually there must be an upper
limit to the number of letters printed per min. If the hyperbola is
solved for y, we obtain:


$$y = b + \frac{k}{x - a} \tag{3}$$


As x, the number of trials, increases, the expression k/(x − a), which is
negative, approaches zero, and y, the predicted score, approaches b as
an upper limit. Therefore, the hyperbola furnishes an estimated
upper limit to score after infinite practice under the specified conditions.
It also yields a measure of the rapidity with which the upper
limit is being approached. The square root of the absolute value of
2k is geometrically interpreted as half the length of the major axis of
the hyperbola. If the major axis is short, the curve is rising rapidly
and bending sharply; if it is long, the curve is less steep and has a more
moderate curvature.

Obviously, other formulas could have been chosen to represent
the data. Throughout this discussion, statements are based on what
happened during the first 20 trials. ‘Scores’ on trials after those
first 20 trials are extrapolations on the assumption that added trials
would yield scores which follow the hyperbola fitted to the scores on
the first 20 trials. Had another equation, such as the Gompertz
curve or the arc cotangent function, been used, an upper limit
different from b, the upper limit predicted from the hyperbola, would
have been found. And had a power function, such as the generalized
psychometric function proposed by Guilford (3), been used, there
would have been no upper limit at all predicted. Instead, it would
have been expected that any score whatever could be reached after
sufficient practice. We attempt to foresee the whole course of learning
from what has happened during the first 20 trials, and the prediction
depends on the mathematical equation used to smooth the scores
on those trials. However, constants have uses other than prediction.
They also give smooth values for the observed scores, and in this
sense they represent something inherent in the data. Thus, they
furnish a few values in terms of which the various curves may be
compared. It is to be expected that when several equations are
fitted to the same set of data, there will be close relationships between
constants measuring somewhat the same characteristics of the data,
whether those equations be theoretically correct or not. This is
shown in a study carried out by Woodrow (12), who fitted both the
Gompertz equation and Robertson’s autocatalytic equation to each
of several hundred individual learning curves. He found that, although
the absolute values of the parameters which represented the
predicted upper limits of practice were different, there was nearly
perfect correlation between the values predicted from the Gompertz
equation and those predicted from the autocatalytic curve. Similarly,
there was high correlation between other comparable constants.
His conclusion was that the constants in each case represented something
in the data which they described, and that therefore the two
sets of constants were measures of the same thing. In view of his
results, it is likely that constants derived from the hyperbolas used in
the present study would be closely related to similar constants of a
more generalized curve.

The parameters of the hyperbolas which, by a least squares
method, best fit the data of Table II are given in Table III.


TABLE III
PARAMETERS OF BEST-FITTING HYPERBOLAS

Group        a         b          k*       √|2k|
I         −10.17     63.61     −429.40     29.3
Ia**       −7.01     59.74     −271.94     23.3
II         −5.66     67.31     −276.67     23.5
III        −7.16     70.40     −347.38     26.4
IV         −7.71     74.08     −433.42     29.4
V          −7.77     80.68     −457.36     30.2
VI         −7.93     85.83     −512.90     32.0
VII        −7.50     89.99     −505.74     31.8
VIII       −9.34     91.47     −663.81     36.4

* The negative sign of k indicates direction and is unimportant in finding the length of the
major axis.
** Based on first twenty trials only.


Constants were not computed for groups having fewer than 20 trials.
The theoretical curves derived from the hyperbolas are very close
fits to the obtained data for the 20 trials. Although the theoretical
curve for Group Ia underestimates scores earned on later trials,
constants from the first 20 trials were used in order to compare Group
Ia with the other experimental groups.
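
The text states only that the hyperbolas were fitted "by a least squares method"; the procedure itself is not described. The following is a minimal sketch of one way such a fit could be carried out, and is not to be taken as the method actually used. It relies on the fact that, for a fixed value of a, equation (3) is linear in b and k, so that a may be found by a one-dimensional search. The function names, the search limits for a, and the synthetic means (generated from the Group I constants of Table III, since the first twenty trial means are not reproduced here) are all assumptions made for illustration.

```python
# Sketch only: least-squares fit of y = b + k / (x - a) to a mean learning curve.
# For fixed a the model is linear in b and k, so a is found by a 1-D search.

def fit_line(u, y):
    """Ordinary least squares for y = b + k*u; returns (b, k, sse)."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suy = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y))
    k = suy / suu
    b = my - k * mu
    sse = sum((yi - (b + k * ui)) ** 2 for ui, yi in zip(u, y))
    return b, k, sse

def fit_hyperbola(trials, means, a_lo=-50.0, a_hi=0.5, step=0.01):
    """Coarse grid over a (kept below the first trial), then golden-section refinement."""
    def sse_for(a):
        return fit_line([1.0 / (x - a) for x in trials], means)[2]

    n_grid = int(round((a_hi - a_lo) / step))
    grid = [a_lo + i * step for i in range(n_grid + 1)]
    best = min(grid, key=sse_for)

    lo, hi = best - step, best + step
    g = (5 ** 0.5 - 1) / 2
    for _ in range(100):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if sse_for(c) < sse_for(d):
            hi = d
        else:
            lo = c
    a = (lo + hi) / 2
    b, k, _ = fit_line([1.0 / (x - a) for x in trials], means)
    return a, b, k

# Synthetic "means" for 20 trials, generated from Group I's constants (assumption:
# used here only because the observed means for Trials 1-20 are not available).
trials = list(range(1, 21))
true_a, true_b, true_k = -10.17, 63.61, -429.40
means = [true_b + true_k / (x - true_a) for x in trials]

a, b, k = fit_hyperbola(trials, means)
print(f"a = {a:.2f}, b = {b:.2f}, k = {k:.2f}")
```

With noiseless synthetic data the search should recover the generating constants closely; with observed means the three constants are strongly interdependent, and small changes in the scores can move a, b, and k together.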

The constant b, which is predicted score after infinite practice, 
rises with increase in rest between trials. It rises at a decreasing rate, 
which may be interpreted as meaning that there is a limit to the 
effectiveness of added rest between trials. The differences among 
the groups in this parameter imply that an infinite number of trials 
under the condition of massed practice would never produce the same 
scores as an infinite number of trials under spaced practice. Furthermore,
whatever accounts for the beneficial effect of the pauses produces
most of the benefit with pauses of a fraction of a minute’s duration.
It is also rather remarkable that the values of b are so closely
related to duration of rest pause, for there is some overlapping in the
learning curves obtained under the various distributions of practice.

Another constant of the hyperbola, namely k, determines the rate
at which the learning curve approaches its maximum, b. It will be
recalled that the square root of the absolute value of 2k is geometrically
interpreted as the length of the semi-major axis of the hyperbola.
When it is long, the curve is rising slowly. From Table III it is
evident that the length of the semi-major axis increases slowly, but
irregularly, with duration of rest interval. This means that the
longer rest groups are, at any given trial, farther from their predicted
upper limits than are those groups with less rest.
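
The geometric interpretation of k may be verified by a short derivation, given here as a worked check and not as part of the original analysis. The auxiliary symbols u, v, p, and q are introduced only for this purpose; the numerical values use the constants of Groups I and VIII from Table III.

```latex
% Shift the origin to the center of the hyperbola: u = x - a, v = y - b, so uv = k.
% Rotate the axes through 45 degrees: u = (p - q)/\sqrt{2}, v = (p + q)/\sqrt{2}.
\begin{align*}
uv &= \frac{p^2 - q^2}{2} = k
  \quad\Longrightarrow\quad
  \frac{q^2}{2|k|} - \frac{p^2}{2|k|} = 1 \qquad (k < 0), \\[4pt]
\text{semi-major axis} &= \sqrt{2|k|},
  \qquad \text{e.g., Group I: } \sqrt{2 \times 429.40} = \sqrt{858.80} \approx 29.3 . \\[4pt]
\text{Distance below the upper limit at trial } x:\quad
b - y &= \frac{-k}{x - a} = \frac{|k|}{x - a}, \\[4pt]
\text{Group I, } x = 20:\quad & \frac{429.40}{20 + 10.17} \approx 14.2 \text{ letters}, \\
\text{Group VIII, } x = 20:\quad & \frac{663.81}{20 + 9.34} \approx 22.6 \text{ letters}.
\end{align*}
```

On the fitted curves, then, the 60-sec. group stands about 23 letters below its predicted limit at the twentieth trial, while the massed group stands about 14 letters below its own, lower, limit.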

The third constant, a, locates the vertical asymptote of the
hyperbola, the line x = a. There will be no attempt to refer it to
the observed data.

The learning curves of raw score means appear to be about the 
same for all groups having 45 sec. or more rest. However, constants 
derived from the hyperbolas indicate that groups with longer rest 
intervals are rising slowly to high maximums and that groups with 
shorter rest pauses are rising relatively quickly to low maximum mean 
scores. Throughout the range of durations used here, added rest 
has progressively less effectiveness in increasing mean score. 

These results are similar to those reported from other experiments. 
It is characteristic of studies of distributed practice that any one 
considerable rest period is almost as beneficial as any other, but that 
all such rest periods are much more beneficial than no rest at all 
(massed practice). This is borne out in the data shown in Fig. 1, a 
graph of mean gains. No rest at all shows a mean gain of 24 letters 
written between Trial 1 and Trial 20; 60 sec. shows a gain of 45 letters. 
A rest of 10 sec. produces a gain of 36 letters, which is past the mid- 
point between the other two gains. Therefore, as amount of rest 
between trials increases, gains between two designated trials fail to 
keep pace with the increased rest. That is, a rest of a few seconds 
produces most of the maximum possible gain. 
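
The arithmetic behind this statement may be set out explicitly, using only the three gains quoted above from Fig. 1:

```latex
\begin{align*}
\text{midpoint of the 0-sec. and 60-sec. gains} &= \frac{24 + 45}{2} = 34.5 \;<\; 36, \\[4pt]
\text{share of the added gain reached with 10 sec. of rest} &= \frac{36 - 24}{45 - 24}
  = \frac{12}{21} \approx 0.57 .
\end{align*}
```

Thus one-sixth of the 60-sec. rest yields somewhat more than half of the additional gain which the full 60 sec. provides over no rest at all.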

Snoddy (10) reported a series of experiments involving a modified 
form of the classical mirror-drawing task. His five groups of Ss 
practiced at intervals varying from no rest at all, through intervals 
of 30, 60, and 120 sec., and 24 hours. He found a great difference 
between the zero rest group and that which had 30 sec. rest. The 
differences among the four rested groups were slight, but directly 
related to the duration of the inter-trial rest. His experiment 
differed from the one reported in this paper, however, not only in 
kind of task, but also in length of rest intervals used. For mirror-drawing,
a 24-hour rest produced higher mean scores than a two-min.
rest. It seems unlikely that a pause of two min. or of 24 hours
would be more beneficial than one of 60 sec. in writing the inverted
alphabet. Thus, it may be that the limit of effectiveness of added
rest between trials is a function of the task.

A study reported by Lorge (7) points to this conclusion. Lorge’s 
Ss practiced at intervals of zero rest, one min., and one day. He 
obtained data under those distributions on all the following tasks: 
the stabilimeter, mirror reading, letter-for-letter code substitution, 
and memorizing nonsense numbers. For nonsense numbers and 
mirror reading, he found that added rest between trials produced 
higher mean scores. For code translation, however, he found the 
one-day group less proficient than the one-min. group. He attributed 
this difference to the Ss’ forgetting the code over the longer interval. 
For the most part, his experiment agrees with the present one in finding
that added rest results in higher scores. But, on one task, he
found a whole group hindered rather than helped by an increased 
rest interval. 

There has been some question regarding the permanence of the 
effects of distributed practice. Work by Gentry (2) indicates that 
with letter-for-letter code translation and mirror reading, the effects 
of massed and of spaced practice are rather transitory in that changes 
from massed to spaced and vice versa produced rapid changes in 
mean learning curves. In other words, after a very few trials under 
massed practice, a group which has had spaced practice will ap- 
proximate the mean scores of a group which has had massed practice 
all along. Similarly, a change from massed to spaced practice brings 
an immediate rise in mean score. 

Such experiments do not indicate that the two conditions neces- 
sarily would have eventually produced the same upper limit of score. 

An experiment of Hovland (5) suggests that the differential effects 
of learning under massed and distributed practice are more lasting. 
His Ss learned lists of 12 nonsense syllables to a criterion of seven 
correct syllables by distributed and by massed practice. The order 
of learning was so balanced that practice effects and the like were 
ruled out. At intervals of 6 sec., 2 min., 10 min., and 24 hours after 
the original learning, the Ss relearned under massed practice to the 
criterion of one perfect recitation. Hovland found that, a short 
time after learning, retention under the two conditions was about 
equal. However, after 10 min. had elapsed, the distributed condition 
gave higher retention. At the end of 24 hours also there was superior 
retention for the condition of distribution. 

Offhand, the results of Gentry and Hovland look contradictory. 
In Gentry’s experiment, as was mentioned above, mean scores 
changed quickly as conditions changed from massed to distributed 
practice. In Hovland’s experiment, retention, as measured by
number of syllables correctly recalled, was higher for the spaced condition
even after 24 hours. Furthermore, when the Ss, at varying
intervals after the original learning, continued learning to a criterion 
of 12 correct syllables, the lists learned originally under spaced prac- 
tice to a criterion of seven syllables required fewer trials than did 
those learned under massed practice. If a situation analogous to 
Gentry’s obtained with Hovland’s data, one would expect that it 
would take just as many trials to finish learning material initiated 
under distributed practice as it would for material originally learned 
under massed practice, for the criterion was seven correct syllables 
under each condition. This discrepancy between Gentry’s and 
Hovland’s results may exist largely because of the different tasks and 
experimental designs used. 

Neither Hovland’s results nor those from the inverted alphabet 
support the following statement of Lorge (7): 


“If regular distribution (of practice) is beneficial at all, the group practicing under the
condition of distribution will reach a plateau or limit of practical achievement in fewer units of
practice than the group working under the condition of massing. . . . Hence observed superiority
in favor of a distributed practice group will become less until eventually such superiority becomes
zero.”


As was mentioned in the preceding paragraph, Hovland’s Ss learned 
to the same criterion under both massed and distributed practice. 
Despite the fact that more trials were required to reach the criterion 
under massed practice, retention was superior under the spaced 
practice condition. This would imply that the difference between 
the two conditions is more than a mere difference in immediate 
achievement, and hence would lead one to doubt that superiority of 
the distributed group would vanish after a great number of trials. 
With the inverted alphabet, 70 trials under massed practice for
Group Ia failed to produce the score attained in 20 trials by Group
VIII. If the parameters b may be used to predict average score after
infinite practice, then one would expect that the massed groups
would never attain the upper limits of the rested groups. The other
constant of the hyperbola, k, indicates that at any particular trial, the
longer rest groups are farther away from their predicted upper limits
than are the shorter rest groups. This, too, fails to support Lorge’s
statement that the distributed practice group requires fewer trials
to reach a “limit of practical achievement.” Although the question
can be settled only by carrying the learning curves far beyond the
segments observed here, available data fail to indicate that massed
and spaced practice will eventually yield the same results.

The aim here has been to find the relationship between mean score
and the two experimental variables, number of trials ($x_1$) and duration
of inter-trial rest ($x_2$). Subject variables have been held constant,
presumably, by studying the mean practice curves from comparable
groups of Ss. In terms of the basic equation (1), which was introduced
earlier, mean score is a function of the two experimental variables,
or,


$$Y = f(x_1, x_2). \tag{4}$$


The mean score, Y, has been expressed as a function of $x_1$ through the
use of a hyperbola:


$$Y = b + \frac{k}{x_1 - a}. \tag{5}$$


The parameters b, k, and a of the hyperbola show the relationship
between mean score Y and number of trials, $x_1$. The parameters
themselves, however, are functions of $x_2$, duration of rest interval
between trials. Therefore, the equation for Y in terms of the two
experimental variables becomes:


$$Y = \Phi_1(x_2) + \frac{\Phi_2(x_2)}{x_1 - \Phi_3(x_2)}. \tag{6}$$


The Φ functions are not defined mathematically; their values are
given in Table III, which lists the constants of the hyperbolas as
functions of duration of rest interval.

Mean score is a negatively accelerated function both of number of
trials and of amount of rest between trials. That is, mean score
rises more and more slowly with added trials. Similarly, score increases
rapidly as rest between trials increases from zero to rather
short intervals, but it increases relatively slowly when rest intervals
change from one fairly long duration to another. Thus, with both
experimental variables, there is a limit to the effectiveness of changed
conditions in improving score.
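
Since the Φ functions are given only as tabled values, the two-variable expression for mean score can be evaluated by simple lookup. The sketch below is illustrative only: the names PHI and mean_score are hypothetical, the constants are copied from Table III, and the choice of Group I (not Group Ia) to represent zero rest is an assumption made here. No interpolation between the tabled rest durations is attempted.

```python
# Sketch only: mean score as a function of trial number (x1) and rest in sec. (x2),
# with the Phi functions represented by a lookup of the Table III constants.

# rest in sec. -> (Phi1 = b, Phi2 = k, Phi3 = a); 0 sec. uses Group I (assumption)
PHI = {
    0:  (63.61, -429.40, -10.17),
    3:  (67.31, -276.67, -5.66),
    5:  (70.40, -347.38, -7.16),
    10: (74.08, -433.42, -7.71),
    15: (80.68, -457.36, -7.77),
    30: (85.83, -512.90, -7.93),
    45: (89.99, -505.74, -7.50),
    60: (91.47, -663.81, -9.34),
}

def mean_score(x1, x2):
    """Predicted mean score at trial x1 for a tabled rest duration x2 (sec.)."""
    phi1, phi2, phi3 = PHI[x2]
    return phi1 + phi2 / (x1 - phi3)

# Predicted mean at the twentieth trial for each tabled rest duration.
for rest in sorted(PHI):
    print(f"rest {rest:>2} sec.: predicted mean at Trial 20 = {mean_score(20, rest):.1f}")
```

The fitted values at a fixed trial rise steeply over the first few seconds of rest and much more gradually thereafter, although, as with the observed gains, the progression from one tabled duration to the next is not perfectly regular.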





SCORE AS A FUNCTION OF SUBJECT VARIABLES 


The dependence of score on experimental variables is only one part 
of the relationship between score and its determining conditions. 
There still remains the question of how score depends on those deter- 
miners which account for variability among individuals. Two as- 
pects of group variability will be of concern here. The first is the 
relationship between extent of group variability and the two experi- 
mental conditions; the other involves explaining that variability as a 
function of its determiners. 

Throughout the present discussion, the standard deviation of the 
distribution will be used as the measure of group variability. It 
has been chosen in preference to the ‘relative’ measures of variability,
such as the coefficient of variation, because the latter require assumptions
concerning the absolute zero of the scores.
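
The objection to relative measures can be made concrete with a small numerical illustration. The four scores below are hypothetical and serve only to show that moving the assumed zero point leaves the standard deviation unchanged while altering the coefficient of variation.

```python
# Sketch only: the standard deviation is unaffected by the location of the zero
# point, whereas the coefficient of variation (SD / mean) depends on it.
from statistics import mean, pstdev

scores = [20.0, 25.0, 30.0, 35.0]        # hypothetical letters per minute
shifted = [s + 10.0 for s in scores]     # same scores, zero point placed 10 units lower

sd, sd_shifted = pstdev(scores), pstdev(shifted)
cv, cv_shifted = sd / mean(scores), sd_shifted / mean(shifted)

print(f"SD: {sd:.3f} vs {sd_shifted:.3f}")
print(f"CV: {cv:.3f} vs {cv_shifted:.3f}")
```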

Obviously, the standard deviations are affected by the reliability 
of the scores. Other things equal, if individual practice curves show 
many inversions from trial to trial, then the standard deviations tend 
to be high; they reflect not only differences among Ss, but also varia- 
tions of an individual’s scores from a smooth course of learning. In 
effect, then, score reliability may be defined as closeness of conformity 
to a smooth curve of learning. In the absence of such a measure of 
reliability, the correlations between the first two and last two trials 
of each group are given in Table IV as approximations to the reliability
of individual scores at the beginning and at the end of practice,
respectively. There is no systematic difference between groups in
reliability of initial and of final scores as measured by the correlations
given in Table IV. Even though it is probable that they underestimate
the true reliabilities, the correlations in Table IV are high
enough to warrant the assumption that true group variability
($\sigma_{true} = \sigma_{obs}\sqrt{r_{11}}$) is not seriously different from the observed
variability. It follows, therefore, that our measure of group variability,
that is, the observed standard deviation, shows chiefly differences
among individuals and not irregularities of individual learning curves.


TABLE IV
CORRELATIONS BETWEEN THE FIRST TWO AND THE LAST TWO TRIALS

                                                      Reliability
Group    Duration of Rest      Number of Trials    Initial    Final
I        0 sec.                20                  .853       .775
Ia       0 sec.                70                  .800       .788
II       3 sec.                20                  .884       .854
III      5 sec.                20                  .881       .871
IV       10 sec.               20                  .745       .908
V        15 sec.               20                  .829       .871
VI       30 sec.               20                  .706       .922
VII      45 sec.               20                  .738       .842
VIII     60 sec.               20                  .796       .895
VIIIa    60 sec. (tallies)     10                  .801       .871
IX       90 sec.               15                  .858       .847
X        7 days                12                  .846       .877
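
The size of the correction implied by the parenthetical relation can be sketched numerically. The sketch below assumes that the relation is the familiar one in which the true standard deviation equals the observed standard deviation times the square root of the reliability; the function name and the choice of two Table IV correlations as reliability estimates are illustrative assumptions, not part of the original analysis.

```python
import math


def true_sd(observed_sd: float, reliability: float) -> float:
    """Return observed_sd * sqrt(reliability), the assumed true-score SD."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return observed_sd * math.sqrt(reliability)


# Two correlations taken from Table IV, used here as reliability estimates.
GROUP_IV_INITIAL = 0.745
GROUP_VI_FINAL = 0.922

# Fraction of the observed SD that would remain as "true" SD in each case.
shrink_low = true_sd(1.0, GROUP_IV_INITIAL)
shrink_high = true_sd(1.0, GROUP_VI_FINAL)

if __name__ == "__main__":
    print(f"r = {GROUP_IV_INITIAL}: true SD is {shrink_low:.3f} of observed")
    print(f"r = {GROUP_VI_FINAL}: true SD is {shrink_high:.3f} of observed")
```

With correlations of this size, the true standard deviation would be roughly 86 to 96 percent of the observed one, which is the sense in which the two are "not seriously different."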


Group Variability as a Function of the Experimental Variables 


Table II, which gives the mean and standard deviation for each
group at each trial and also the variability at each trial as a proportion
of the initial variability, reveals a tendency for both the mean
scores and the standard deviations to increase with added practice
and added rest between trials. It is not unusual to find that large
standard deviations accompany large mean scores. In the situation
at hand, however, it is possible to discern between variability and
spacing a relationship which is not entailed simply by the higher
mean scores produced by spacing. For Group I and Group Ia, the
variability remains unchanged during the first 20 trials even though
the mean scores increase. In the rested groups, both the means and
the standard deviations increase with practice, but the relationship
between the two depends on the duration of the inter-trial rest. Table
V shows the correlations and the regression lines predicting variability
from mean score for each experimental group.


TABLE V
REGRESSION LINES OF σ_dis UPON ACCOMPANYING MEANS

Group     r_Mσ      Regression Equation
I         -.336     σ = -.0230 M + 9.20
Ia*       -.203     σ = -.0160 M + 8.49
II         .079     σ =  .0065 M + 9.93
III        .343     σ =  .0268 M + 7.75
IV         .852     σ =  .0945 M + 4.54
V          .780     σ =  .0552 M + 8.31
VI         .888     σ =  .0662 M + 8.07
VII        .616     σ =  .0436 M + 6.94
VIII       .692     σ =  .0391 M + 7.61
VIIIa      .906     σ =  .0914 M + 4.63
IX         .878     σ =  .0558 M + 5.93
X          .925     σ =  .1690 M + 3.58

* Based on first 20 trials only.


Although there are some inversions in the regression coefficients, 
probably because of sampling errors, the trend is clearly for the 
regression to be steeper for the longer rest interval groups. That is, 
variability rises more rapidly with spaced practice than with massed 
practice, even though the size of the mean score is taken into account. 
Furthermore, within the limits of this experiment, the longer the rest 
between trials, the greater is the rise in variability. This rise seems 
to depend on a change of task rather than on a rest pause alone. 
Group VIIIa spent its ‘rest’ interval performing another task;
nevertheless, group variability rose steadily with practice. 
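
The trend in the regression coefficients can be summarized with a rank correlation between rest duration and slope. This is only an illustrative sketch and not a computation reported in the study: the helper functions are hypothetical, Groups Ia and VIIIa are left out (Ia shares its rest duration with Group I, and VIIIa filled its interval with another task), and the slopes are those of Table V.

```python
def ranks(values):
    """Return 1-based ranks; assumes no tied values (true for the data below)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n(n^2-1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Rest between trials, in seconds, for Groups I, II, III, ..., X (7 days last).
rest_sec = [0, 3, 5, 10, 15, 30, 45, 60, 90, 7 * 24 * 3600]
# Regression slopes of SD on mean score for the same groups (Table V).
slope = [-0.0230, 0.0065, 0.0268, 0.0945, 0.0552,
         0.0662, 0.0436, 0.0391, 0.0558, 0.1690]

rho = spearman_rho(rest_sec, slope)

if __name__ == "__main__":
    print(f"Spearman rho between rest and slope: {rho:.3f}")
```

The coefficient comes out at about .67, a positive but imperfect ordering, which agrees with the description of a clear trend interrupted by some inversions.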

One hypothesis which might explain the increased variability 
with spacing is that whatever causes the increase occurs during the 
rest intervals and not during the trials themselves. Such a hy- 
pothesis does not account for the fact that Group Ia, which had 70 
trials with no rest, showed increased variability at the later trials. 
It is possible that there was a cumulative effect from the slight rest 
of three sec. which was unavoidably introduced when the Ss discarded 
one sheet of the test blank and went to the next one. Another 
possibility is that the effect of massed practice is not to keep variability
constant, but is rather to keep it from rising rapidly. It is
plain that added rest is not equally beneficial for all individuals. On
Trial 11, Group IX has a mean score of 60.3 and a standard deviation
of 9.29; on the same trial, Group X has a mean of 59.8 and a standard
deviation of 13.29. This indicates that there are individuals in Group
X whose scores exceed all those in Group IX and also that there are
individuals in Group X whose scores fall below all those of Group IX.
Graphs of the distributions of scores, which are not shown here,
confirm this observation.

Historically the problem of the effect of practice on variability 
has been connected with the question of the relative importance of 
heredity and environment in producing individual differences. Most 
of the facts of various investigations agreed: with added trials in 
amount-done-per-unit-time scores, the standard deviation of the 
distribution rises, as does the mean. However, two investigators 
could take the same data, use different measures of variability, and 
arrive at contradictory conclusions. Ordinarily, the standard deviation
does not rise as rapidly as the mean; therefore, one investigator
could assume that $\sigma / M$ measured variability and conclude that individuals
become more alike after practice, while another, looking at the
rise in $\sigma$ itself, could find the group more variable after practice.
Experience with the inverted alphabet indicates that the relationship 
between practice and variability depends on the conditions under 
which the practice occurs. Incidentally, it should be noted that the 
unchanged variability during the first 20 trials of Group I and Group 
Ia contributes nothing to the study of heredity and environment. 
Although the standard deviation does not increase, the correlations 
between scores on Trial 1 and Trial 20 are only in the .40’s. That is, 
as learning progresses, the ranks of individual Ss change considerably. 
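
The way two measures of variability can point in opposite directions is easy to show with one of the regression lines of Table V. The sketch below uses the Group VI line; the two mean scores are hypothetical values chosen only for illustration, and the function name is an assumption of this example.

```python
def sd_from_mean(mean_score: float, slope: float = 0.0662,
                 intercept: float = 8.07) -> float:
    """Predict the SD from the mean using the Group VI line of Table V."""
    return slope * mean_score + intercept


# Hypothetical early and late mean scores (not values from Table II).
early_mean, late_mean = 30.0, 60.0

early_sd = sd_from_mean(early_mean)
late_sd = sd_from_mean(late_mean)

# Relative variability: the SD divided by the mean.
early_cv = early_sd / early_mean
late_cv = late_sd / late_mean

if __name__ == "__main__":
    print(f"SD:      {early_sd:.2f} -> {late_sd:.2f}")
    print(f"SD/mean: {early_cv:.3f} -> {late_cv:.3f}")
```

Because the regression line has a positive intercept, the standard deviation rises with practice while the ratio of standard deviation to mean falls, so the same data support either conclusion depending on the measure chosen.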

A few previous experimenters have published means and standard 
deviations of groups of scores obtained under varied distributions 
of practice. Their results do not agree with the present study as to 
the dependence of the relationship between mean and standard 
deviation upon spacing. 

Lorge (7) and Gentry (2) report means and standard deviations 
at each trial for the tasks of mirror reading, memorizing nonsense 
numbers, and code substitution. Although their mean scores show 
more improvement with spaced practice than with massed, the rela- 
tionship between mean and standard deviation, contrary to the pres- 
ent findings, remains the same, regardless of the learning condition. 
However, their ‘massed practice’ groups actually had a few seconds 
rest between all trials. Work on the inverted alphabet shows that a 
few seconds may be long enough to produce changed variability. 
Bell (1) and Hilgard and Smith (4) report data derived from the
Koerth pursuit rotor. They too found for the most part an increase
in variability with added trials. However, the relationship between 
the mean and standard deviation failed to vary with spacing of 
practice. Here again, very short rest intervals were not used. 

It might seem that memory experiments would contribute to the 
question of the effects of spacing on group variability. However, 
most studies of delayed recall are based on learning lists of nonsense 
syllables, and the standard deviation of the number retained is 
usually affected by the closeness of scores to complete retention or to 
zero. One experiment, that of Shurrager (9), uses scores derived 
from absolute scaling and thus overcomes this difficulty. Insofar as 
a correspondence may be assumed between the pause before recall 
and the pauses between trials of the inverted alphabet, her findings 
agree with those of the present experiment. For a fixed mean scale 
score, she found greater variability under delayed recall than under 
immediate recall. 

To summarize, it may be said that variability almost universally 
rises with added trials, when the score is amount-done-per-unit-time. 
For the inverted alphabet, as well as for other tasks, the increased 
variability accompanies increased mean scores. Moreover, in the 
present study, there was evidence that spacing itself affected vari- 
ability, and that its influence was not attributable merely to the 
higher mean scores produced by the spacing. There was almost no
change in variability under massed practice, and there was a very 
marked change when seven days’ rest was interpolated between trials. 
And yet, the mean scores rose under all conditions of distribution. 
As was pointed out in the preceding paragraphs, other experimenters 
have not reported this phenomenon. However, most other studies 
differ from the present one in the proximity of ‘massed’ practice to 
zero rest between trials, and hence neither contradict nor confirm the 
finding. 


Relationship between Experimental Variables and Subject Variables 


Within limits, both added trials and added rest between trials 
result in higher mean scores. In the immediately preceding section, 
it was shown that group variability depends on the number of trials 
and also on the duration of the inter-trial rest. That is, the extent 
of the group variability, which is a measure of the effect of subject 
variables, depends on the two experimental variables. The subject 
variables themselves, which account for the individual differences, 
were investigated by factor analyzing correlations among scores on 
Trials 1, 2, 5, 10, 12, 15, 19, 20, 25, 45, and 65.

Correlations between any two trials proved to be functions of the
ordinal numbers of the trials, but not of the amount of rest between 
them, and hence, not of the differences between the two means. 
Since such correlations were independent of size of mean gain between 
the trials, it is unlikely that changed difficulty of task with added 
trials produced spurious changes in the correlation coefficients. 
Furthermore, statistical tests showed these correlations to differ 
from one another only by errors of random sampling. Since the 
variation of correlations between any two trials was not only inde- 
pendent of duration of rest interval, but also insignificant in extent, 
averages of the correlations were used in the factor analysis. Table
VI lists the average correlations; it also gives a summary of the centroid
factor analysis. 

Two orthogonal factors with inversely related loadings suffice to 
explain the correlations of Table VI. The similarity of correlation 
matrices does not guarantee the identity of the factors from one group 
to another. This point and the proper interpretation of the factors 
can be decided only by analyzing scores on these 11 trials for each 
experimental group as part of a battery of tests of established factor 
content. However, the variability of Trial 1 should have the same 
explanation, regardless of the condition of distribution. Therefore, 
the assumption of identical factors for all experimental groups does 
not seem untenable. 
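
The sense in which two factors "suffice" can be illustrated by reproducing a few of the average correlations from the centroid loadings of Table VI. This is a sketch of the standard orthogonal-factor identity (the reproduced correlation is the sum of products of loadings), applied to four trial pairs picked arbitrarily for the example; the dictionary and function names are assumptions of this illustration.

```python
# Centroid loadings (I_c, II_c) from Table VI, Part B, for selected trials.
CENTROID = {
    1: (0.635, -0.604),
    2: (0.736, -0.546),
    10: (0.921, 0.075),
    15: (0.923, 0.162),
    45: (0.738, 0.304),
    65: (0.743, 0.304),
}

# Average observed correlations for a few trial pairs (Table VI, Part A).
OBSERVED = {(1, 2): 0.828, (1, 65): 0.225, (10, 15): 0.844, (45, 65): 0.691}


def reproduced_r(trial_a: int, trial_b: int) -> float:
    """Correlation implied by two orthogonal factors: sum of loading products."""
    a1, a2 = CENTROID[trial_a]
    b1, b2 = CENTROID[trial_b]
    return a1 * b1 + a2 * b2


# Residual = observed correlation minus the two-factor reproduction.
residuals = {pair: OBSERVED[pair] - reproduced_r(*pair) for pair in OBSERVED}

if __name__ == "__main__":
    for pair, value in residuals.items():
        print(pair, f"{value:+.3f}")
```

For these four pairs the residuals all fall within about .07 of zero, in line with the distribution of residuals summarized in Part C of the table.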

If it may be assumed that the same factors account for the correla- 
tions of each group, then we are forced to the conclusion that added 
rest and added trials function in different manners to increase score. 
Rest between trials seems to affect group variability, and may possi- 
bly be the chief cause of increased variability with added trials. 
Although there are wide differences in the sizes of the means and 
standard deviations at any particular trial for the several groups, 
correlations between any two particular trials, as already stated, are 
constant from group to group. This indicates that giving rest be- 
tween trials does not alter the factor pattern of the test. If the 
factors may be identified with ‘abilities,’ then it may be concluded 
that the same abilities enter the test at any given trial even though 
the absolute value of the score and the variability of the group are 
different. On the other hand, it is number of trials and not the 
spacing between trials which determines the relative importance of 
the various abilities in explaining individual differences in inverted 
alphabet scores. 

The effect of practice on factor structure has been observed by
Woodrow (11). He found that variability at the beginning of practice
was not to be attributed to the same factors as variability at the
end of practice. In other words, he found the relative importance
of the different subject variables to be a function of practice.


TABLE VI
FACTOR ANALYSIS OF AVERAGE CORRELATIONS BETWEEN TRIALS

A. Average intercorrelations between trials

Trial    1     2     5     10    12    15    19    20    25    45    65
1              .828  .628  .563  .541  .498  .505  .477  .387  .300  .225
2                    .737  .646  .619  .604  .584  .547  .538  .340  .432
5                          .807  .786  .797  .771  .746  .605  .434  .421
10                               .827  .844  .838  .794  .777  .693  .753
12                                     .830  .790  .802  .805  .692  .614
15                                           .863  .841  .818  .737  .704
19                                                 .865  .847  .693  .727
20                                                       .840  .675  .646
25                                                             .720  .773
45                                                                   .691
65

B. Centroid factors

Trial    I_c     II_c     h²
1        .635    -.604    .768
2        .736    -.546    .840
5        .828    -.153    .709
10       .921     .075    .854
12       .894     .061    .803
15       .923     .162    .878
19       .917     .164    .868
20       .890     .171    .821
25       .874     .255    .829
45       .738     .304    .637
65       .743     .304    .644

C. Distribution of residuals after removal of second centroid factor

Interval           Frequency
 .05 to  .069      5
 .03 to  .049      6
 .01 to  .029      5
-.01 to  .009      17
-.03 to -.011      8
-.05 to -.031      8
-.07 to -.051      3
-.09 to -.071      1
 . . .
-.15 to -.131      2
Total              55
Mean               -.007
S.D.                .042

D. Transformation equations

I_r  = .5017 I_c - .8650 II_c
II_r = .8650 I_c + .5017 II_c

E. Rotated factors

Trial    I_r     II_r
1        .840    .246
2        .841    .363
5        .547    .639
10       .397    .835
12       .395    .804
15       .323    .879
19       .318    .875
20       .298    .856
25       .218    .884
45       .107    .790
65       .110    .795
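
The rotation summarized in Table VI can be checked by applying the transformation equations of Part D to the centroid loadings of Part B and comparing the result with Part E. The sketch below does this for all 11 trials; the variable names are assumptions of the example, and the tolerance of .002 simply allows for three-decimal rounding in the printed table.

```python
# Centroid loadings (I_c, II_c) from Table VI, Part B.
CENTROID = {
    1: (0.635, -0.604), 2: (0.736, -0.546), 5: (0.828, -0.153),
    10: (0.921, 0.075), 12: (0.894, 0.061), 15: (0.923, 0.162),
    19: (0.917, 0.164), 20: (0.890, 0.171), 25: (0.874, 0.255),
    45: (0.738, 0.304), 65: (0.743, 0.304),
}

# Rotated loadings (I_r, II_r) as printed in Table VI, Part E.
ROTATED = {
    1: (0.840, 0.246), 2: (0.841, 0.363), 5: (0.547, 0.639),
    10: (0.397, 0.835), 12: (0.395, 0.804), 15: (0.323, 0.879),
    19: (0.318, 0.875), 20: (0.298, 0.856), 25: (0.218, 0.884),
    45: (0.107, 0.790), 65: (0.110, 0.795),
}


def rotate(i_c: float, ii_c: float) -> tuple:
    """Apply the Part D transformation equations to one pair of loadings."""
    i_r = 0.5017 * i_c - 0.8650 * ii_c
    ii_r = 0.8650 * i_c + 0.5017 * ii_c
    return i_r, ii_r


# Largest absolute disagreement between computed and printed rotated loadings.
max_diff = max(
    abs(computed - printed)
    for trial in CENTROID
    for computed, printed in zip(rotate(*CENTROID[trial]), ROTATED[trial])
)

if __name__ == "__main__":
    print(f"largest discrepancy: {max_diff:.4f}")
```

The computed and printed loadings agree to within rounding, and the rotated values make the shift plain: the first factor dominates the earliest trials and the second dominates from about Trial 10 onward.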

The exact mathematical relationship between score and subject 
variables has not been established here, but it is possible to make 
some definition of the general function. The general expression for
the relationship between score and its determining conditions has
been stated as follows:


$$ y = f(x_1, x_2, x_3, \ldots, x_n). \tag{1} $$


For the purpose of the present experiment, the problem was restated:


$$ Y = f(x_t, x_d, S), \tag{7} $$


in which $Y$ is the raw score of a subject, $x_t$ is number of trials, $x_d$ is
duration of rest between trials, and $S$ is the totality of subject
variables.

In an earlier section, the problem was to find the relationship 
between the two experimental variables and mean score. That 
relationship, $\Phi(x_t, x_d)$, predicts mean score from number of trials and
amount of rest between trials. Any individual score $Y$ may be
expressed as a deviation from a mean score in terms of the standard
deviation of the distribution of scores as follows:


$$ Y = \bar{Y} + \sigma_{dis}\, z, \tag{8} $$


in which $\bar{Y}$ is the mean score, $\sigma_{dis}$ is the standard deviation of the
distribution, and $z$ is the individual’s standard score. Since the $S$
variables account for deviations from $\Phi$, functions involving the $S$
variables should be added to $\Phi$. Therefore, equation (7) takes the
form:


$$ Y = \Phi(x_t, x_d) + F_1(S), \tag{9} $$


in which $F_1(S)$ is defined only as a function which involves $S$ variables.

For present purposes, let us suppose that the two factors, $I_r$ and
$II_r$, of Table VI account for all the variance of the trials. Equation
(9) may then be defined further:


$$ Y = \Phi(x_t, x_d) + F(I_r, II_r)\, \sigma_{dis}, \tag{10} $$


in which $I_r$ and $II_r$ are measures of the individual S’s standard scores
in the two ‘abilities.’ The basic equation of factor analysis states
that a standard score, $z$, in any task is a linear function of the individual’s
abilities and of the factor loadings of the task. Therefore,
$z$ in equation (8) may be expressed as:


$$ F_2(x_t)\, I_r + F_3(x_t)\, II_r, \tag{11} $$


for the factor loadings of the task are dependent on the number of
trials, but apparently independent of duration of rest between trials.
$F_2(x_t)$ and $F_3(x_t)$ represent the factor loadings on $I_r$ and $II_r$; their
values are given in Table VI, Part E.

Therefore equation (9) may be further defined: 


$$ Y = \Phi(x_t, x_d) + \sigma_{dis}\,[F_2(x_t)\, I_r + F_3(x_t)\, II_r]. \tag{12} $$


The value of $\sigma_{dis}$ is a function both of number of trials and of duration
of inter-trial rest. Therefore, the complete dependence of an individual’s
score on its determining conditions may be expressed as
follows:


$$ Y = \Phi(x_t, x_d) + F_4(x_t, x_d)\,[F_2(x_t)\, I_r + F_3(x_t)\, II_r], \tag{13} $$


in which $F_4(x_t, x_d)$ expresses $\sigma_{dis}$ as a function of number of trials
and of duration of rest. The expression $\Phi(x_t, x_d)$ is represented by
the hyperbolas fitted to the mean practice curves and by the relationships
between the constants of the hyperbolas and duration of rest
interval.
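
The way the pieces of equation (13) combine for a single subject can be sketched as follows. The rotated loadings are those of Table VI, Part E, for three trials; the mean score, the standard deviation, and the two ability scores are hypothetical inputs standing in for the mean-score function, the variability function, and the subject's factor scores, none of which is given numerically in the text. The function name is likewise an assumption of the example.

```python
# Rotated loadings on the two factors (Table VI, Part E) for a few trials.
# Each entry is trial -> (loading on I_r, loading on II_r).
ROTATED_LOADINGS = {1: (0.840, 0.246), 10: (0.397, 0.835), 20: (0.298, 0.856)}


def predicted_score(mean_score: float, sd: float, trial: int,
                    ability_i: float, ability_ii: float) -> float:
    """Mean score plus SD times the factor-weighted standard score.

    mean_score and sd are hypothetical stand-ins for the mean-score and
    variability functions of trials and rest; the abilities are the
    subject's standard scores on the two factors.
    """
    load_i, load_ii = ROTATED_LOADINGS[trial]
    z = load_i * ability_i + load_ii * ability_ii
    return mean_score + sd * z


# Hypothetical subject: one SD above average on factor I, half an SD below
# on factor II, in a group with mean 50 and SD 10 at Trial 10.
example = predicted_score(50.0, 10.0, 10, 1.0, -0.5)

if __name__ == "__main__":
    print(f"predicted score at Trial 10: {example:.3f}")
```

Since the communalities in Table VI are below unity, a standard score built from the two factors alone will have somewhat less than unit variance; the sketch inherits the text's simplifying supposition that the two factors account for all the variance.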


CONCLUSIONS AND THEORETICAL IMPLICATIONS 


Learning curves for printing the inverted alphabet were obtained 
under 11 different spacings between the one-min. trials. The fol- 
lowing conclusions were reached: 


1. Groups with distributed practice gain more than groups 
with massed practice. 

2. On the basis of hyperbolas fitted through the first 20 trials 
of groups with rest intervals varying from no rest at all to 60 
sec. rest, predicted upper limits of mean learning curves are a 
rising but negatively accelerated function of duration of rest 
interval. 

3. Practice causes variability to rise more rapidly in the case 
of rested groups than in the case of groups with continuous 
practice. This change in variability is not incidental to the size 
of the mean scores. Furthermore, in view of the rise in standard 
deviation for Group VIIIa, it may be concluded that a change of 
task has the same effect as rest, so far as group variability is 
concerned. 

4. Correlations between scores on designated trials are de- 
pendent on the ordinal numbers of the trials but not on the amount 
of rest between trials. 

5. Two orthogonal factors with inversely related loadings
suffice to explain the intercorrelations among scores on Trials 1,
2, 5, 10, 12, 15, 19, 20, 25, 45, and 65. Since Trial 1 should have
the same determiners for all conditions of distribution, it is possible
that the two factors are the same for all the experimental
groups.


This experiment was planned in order to discover the relationship 
between score on the inverted alphabet and some of its determiners. 
Although the solution is far from complete, the basic general state- 
ment of the dependence of score on determining conditions, 


$$ y = f(x_1, x_2, x_3, \ldots, x_n), \tag{1} $$

has been partially defined. Complete definition for the score of an
individual requires a statement not only of the determining condi- 
tions for the mean scores under various experimental conditions, but 
also a statement of the conditions determining variations from the 
mean scores. 

On the assumption that a rectangular hyperbola may be used to 
express the relationship between mean score and number of trials 
under each distribution of effort, equation (6) was written. Devi- 
ations from the mean scores were expressed by the F functions of 
equation (13). Therefore, an individual’s raw score Y may be 
written in this manner as a function of its determining conditions: 


$$ Y = \Phi_1(x_d) + \frac{\Phi_2(x_d)}{x_t + \Phi_3(x_d)} + F_4(x_t, x_d)\,[F_2(x_t)\, I_r + F_3(x_t)\, II_r]. \tag{14} $$

Thus, the five conclusions listed above have been summarized in 
terms of the basic problem of predicting score from its determining 
conditions. 

The general picture is that added trials and added rest between 
trials both result in increased scores. These increases are at first
rapid and then slow. The mean gains between Trial 1 and Trial 10 
are almost double the gains between Trial 10 and Trial 20. Simi- 
larly, a change from no rest to one of 45 sec. doubles the gain between 
the first and tenth trial, but a change from 45 sec. to seven days’ rest 
produces no further increase in gain. Likewise, there is a rapid shift 
in importance of the two factors, which are measures of subject 
variables, in determining the variance of scores. 

In view of this relatively quick shift both in the effect of the ex- 
perimental variables and of the subject variables, the total function 
which describes the relationship between score and its determining 
conditions, that is, equation (14), must be one which cannot increase 
without limit. 

One of the chief current theories of the effectiveness of spaced 
practice is that proposed by Hovland (5) and Hull (6). Essentially
their theory looks upon learning as a difference between excitatory 
and inhibitory processes. Every repetition of a task creates added 
reinforcement of learning and also some inhibitory processes. Both 
sets of factors die out with time, but the inhibitory tendencies die 
out more rapidly. The theory states that added rest between trials 
permits dissipation of inhibitory factors which arise at the time of 
making a response. This theory of differential forgetting has been 
substantiated by work on rote learning (5) and on conditioning (6). 
The data from the present experiment do not contribute directly to 
the theory, but they do raise problems related to points covered by 
it. Hovland, on the basis of his studies of rote learning of nonsense 
syllables, concluded that inhibitory effects disappear in a few seconds.
As the theory would apply to the inverted alphabet, it would state 
that failure of groups with less rest to gain as much as rested groups 
indicates that there is more inhibition present with massed practice. 
Fig. 1 might be taken to represent the increase in freedom from 
inhibitory potential. The mean gains shown in Fig. 1 do not con- 
tradict the statement that inhibitory effects die out rapidly. Earlier 
in the paper, however, it was observed that group variability in- 
creased more rapidly with spaced practice than with massed practice. 
The Hull-Hovland theory has no development which would explain 
this change. If the theory is to be extended to cover the facts of 
increased variability with spacing, then it follows that there are wide 
individual differences in the rate of dissipation of inhibitory tendencies 
with the passing of time. This must be true, for the chief source of 
differences among the groups is held to be the differential dying out of 
inhibition. Since for the first 20 trials of the massed practice groups, 
variability did not rise at all, one might conclude that there is wider 
variation among individuals in the rate of dissipation of inhibition 
than there is in the rate of growth of reaction potential. The theory 
holds that over very long time intervals, reaction potential also 
diminishes. Apparently some modification of the time requirement 
for the dissipation of inhibitory tendencies must be made. Since 
Group X has the greatest variability at Trial 12, but not the highest 
mean score for that trial, it follows that some of the individuals in 
this group are continuing to lose inhibitory potential after seven days, 
and that others are losing some of their reaction potential after that 
period. 
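
The argument that individual differences in the dissipation of inhibition would raise variability under spaced but not under massed practice can be made concrete with a toy model. Everything in the sketch below is an assumption made for illustration: the theory as described here specifies no functional form, so exponential decay, the strengths, and the time constants are all hypothetical, as are the function names. Subjects are made to differ only in how quickly inhibition dies out.

```python
import math
import statistics


def net_strength(rest_sec: float, tau_inhibition: float,
                 excitation: float = 10.0, inhibition: float = 4.0,
                 tau_excitation: float = 1.0e6) -> float:
    """Toy net tendency after a rest: excitation minus inhibition.

    Both terms decay exponentially (an assumption of this sketch), with
    inhibition decaying much faster than excitation.
    """
    remaining_excitation = excitation * math.exp(-rest_sec / tau_excitation)
    remaining_inhibition = inhibition * math.exp(-rest_sec / tau_inhibition)
    return remaining_excitation - remaining_inhibition


# Hypothetical subjects differing only in the time constant (in seconds)
# with which their inhibition dissipates.
INHIBITION_TAUS = [2.0, 5.0, 10.0, 20.0, 40.0]


def group_mean(rest_sec: float) -> float:
    """Mean net tendency across the hypothetical subjects."""
    return statistics.mean([net_strength(rest_sec, tau) for tau in INHIBITION_TAUS])


def group_sd(rest_sec: float) -> float:
    """Population SD of net tendency across the hypothetical subjects."""
    return statistics.pstdev([net_strength(rest_sec, tau) for tau in INHIBITION_TAUS])


if __name__ == "__main__":
    for rest in (0.0, 3.0, 10.0, 30.0):
        print(f"rest {rest:5.1f} sec: mean {group_mean(rest):.2f}, SD {group_sd(rest):.2f}")
```

In this toy version, no rest gives identical scores, while a short rest raises both the mean and the spread. With time constants this short, however, the spread vanishes again at very long rests, so the sketch cannot reproduce the large variability of Group X; that mismatch is the same difficulty the text raises about the time requirement for dissipation.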

The differential forgetting theory does not attempt to explain 
intercorrelations among scores at various trials. From the stand- 
point of factor analysis, different sets of ‘causes’ operate at the 
beginning and at the end of practice. One might hypothesize that 
the factors obtained in Table VI represent the growth of positive 
tendencies (reaction potential) and the dissipation of inhibition with 
practice. The original theory assumes that these two sets of proc- 
esses are inversely related, as are the loadings of the two factors. 
Such an identification of the factors with the concepts of inhibitory 
and reaction potentials would require that inhibitory potentials die 
out according to some function such that trial intercorrelations are 
unaffected from one experimental group to the other. 

In conclusion, it may be said that the present data do not contra- 
dict the Hull-Hovland theory of the effectiveness of distribution of 
effort. On the other hand, the theory fails to explain the relation 
between individual differences and varied distributions of practice. 
The results obtained in the present study indicate that learning 
theory might profit considerably by further study and analysis of
such changes. 
THE INFLUENCE OF SIMULTANEOUS HUNGER AND
THIRST DRIVES UPON THE LEARNING OF TWO 
OPPOSED SPATIAL RESPONSES OF THE 
WHITE RAT 


BY HOWARD H. KENDLER¹


State University of Iowa 


Hull’s (3) recent formulation of a theory of primary motivation 
is to a certain extent guided by his (2) and Leeper’s (5) discovery 
that rats are capable of acquiring differential reactions based solely 
on internal conditions. Using a single choice maze, Hull controlled 
the drive states of his subjects so that on some days a given animal 
was motivated for food but satiated for water, while on other days 
this regimen was reversed. The response of turning right at the 
choice point on ‘hungry’ days was rewarded by food, while response 
to the left on ‘thirsty’ days led to water. As a result of this training 
procedure, the animals gradually acquired the ability to react in 
accordance with the dominant drive. In an independent study by 
Leeper (5) this finding, that organisms can respond differentially to 
an identical objective situation on the basis of different drive states, 
was confirmed. 

How is this phenomenon to be explained? Hull (3) points out 
that we cannot assert that the hunger drive stimulus becomes con- 
nected to turning reactions to the right while the thirst stimulus 
acquires power to elicit left turning responses. If this were true, 
“the animal would when hungry be impelled to turn to the right continuously when in its cage or
wherever it happened to be, as well as at the choice point in the maze. Such an analysis is
quite obviously contrary to fact” (3, page 250).


In order to interpret this phenomenon adequately, Hull concludes 
that it is essential to utilize the principle of patterning. This process 
enables an organism to respond uniquely to a combination of stimuli
as compared to the reactions elicited by the singular presentation of 
the stimulus components of which the compound is composed. At 
the basis of this mechanism is the principle of stimulus trace inter- 
action, i.e., the principle that the hypothetical traces arising from 
stimuli impinging upon the sensorium interact prior to the occurrence 
of the response. Consequently ‘afferent impulses’ arising from stimuli
vary with the context within which they are presented; e.g., the
receptor impulse arising from $S_1$ presented in conjunction with $S_2$ is
different than when $S_1$ is presented along with $S_3$.² In terms of the
experiment under consideration the ‘receptor discharges’ arising from
the stimuli at the choice point interact with the drive stimuli so that
the stimulus pattern of choice point and thirst stimuli tended with
successive reinforcements to elicit a left turning response, while the
pattern of choice point and hunger stimuli finally became connected
to the right turning response.


¹ Appreciation is expressed to Professor Kenneth W. Spence for his valuable suggestions
and advice.


It is the purpose of this paper to present experimental evidence 
bearing upon the adequacy of the above line of reasoning. An 
important element in Hull’s analysis is that there are two patterns 
of stimuli, one of which becomes conditioned to the left response and 
one to the right. These stimulus patterns have in common the cues 
from the choice point and differ from each other in terms of their 
drive components. The question now arises as to whether this 
discrimination behavior can be learned if both hunger and thirst 
drives are present simultaneously during the training series and one 
spatial response (e.g., left) is always and only rewarded with water 
and the other always and only with food. That is, would subjects 
trained in this manner be able to respond appropriately in the maze 
when motivated for either one of the goal objects? 

Strict adherence to the above analysis, with no other factors 
considered, would lead one to expect failure of such discrimination 
behavior on test trials with single motivation, for the reason that 
there is only a single stimulus pattern to which both spatial responses 
become conditioned about equally strongly (assuming the food and 
water rewards to be equal). As both hunger and thirst drive stimulus 
components are present in this stimulus pattern, each becomes 
associated with both spatial responses. So long as one does not 
postulate any selective factor to be operating in the associative 
mechanism, the hunger drive should be just as strongly conditioned 
to making the turn leading to water as to making the turn leading 
to food. Thus, when either one or the other of the drive stimuli is 
presented singly during a test series we should expect the animals 
to respond to the right or left in a random manner, or in other words, 
they should not exhibit the appropriate discrimination behavior with 
respect to the goal objects. 


EXPERIMENTAL PROCEDURE 


Subjects 


The Ss for this experiment were 20 female white rats (ages 66-84 days) from the colony 
maintained by the Psychology Department of the State University of Iowa. 





² For a more adequate treatment of this principle the reader is referred to Chapter 13 of
Hull (3). It is hoped that the above brief exposition provides sufficient understanding to the
reader to comprehend the analysis of the Hull-Leeper phenomenon.


Apparatus


A single-choice T-maze, the floor plans and dimensions of which are shown in Fig. 1, was
used.

Fig. 1. Ground plan of apparatus. S.B. = Starting Box; D1, D2, D3, D4, D5 = Doors;
C1, C2 = Curtains; GB1, GB2 = Goal Boxes.


Walls and floors were constructed of pine board, 54 in. by 3% in. so that the width of the maze
path was 4 in. The maze was covered with hardware cloth. The doors were of sheet metal lined
with felt, so that when manipulated in a vertical direction by means of strings, they were practically
noiseless. E was able to control all the doors from behind a screen. Curtains were hung
before each goal box in order to make it impossible for the animal to see what was present in the
goal box prior to its choice. Both goal boxes contained sheet metal panels into which drinking orifices
similar to the ones in the home cages of the animals could be inserted. These panels could
be hidden by placing wooden guards in front of them. The left alley of the maze was painted
black while the right alley remained unpainted. The left half of the approach alley was also
painted black and the right side remained unpainted. The differentiation was complete, including
walls as well as the floor. The left half of D1 and D2, the whole of D3 and D5 were
painted black. C1 was black while C2 was white. The purpose of this differential treatment
was to emphasize the respective stimulus cues and thereby facilitate discrimination.


Preliminary Training 

After each animal had been handled from three to five min. daily for approximately a week, 
a two-day preliminary training period in the maze was begun. 

Day 1. Adjustment to the apparatus.—All doors were raised and the animal was placed at 
the beginning of the maze and was permitted to explore freely for one hour. During this time 
both drinking panels were covered by their respective guards and no food was in the maze. The 
S’s initial choice was recorded. 

Day 2. Test for position habit.—The animals were given two trials under the same maze 
conditions as the previous day, with the exception that D1 was lowered and S.B. (see Fig. 1) 
was used as a starting compartment. The animal was placed in S.B. and when it approached 
D1, that door was raised. Throughout the rest of the experiment this procedure was followed.
Doors were lowered after animals passed through them, thereby making retracing impossible. 
The S’s choices were recorded, and any animal that chose the same side three times (including 
the initial choice of Day 1) was discarded. However, only one S was eliminated from the
experiment on this basis.


Training Series 
During the training series the animals were motivated by both hunger and thirst. Water 
was present in one goal box, while food (six biscuits of Purina Dog Chow) was placed in the other. 
Half of the animals found food in Goal Box One and water in Goal Box Two, while for the other 
10 animals the situation was reversed. 
Control of the motivational state of the Ss presented some difficulty. It was desired to have
the drives for food and water approximately equal in strength. Preliminary tests had revealed
the fact that, when both drives were present simultaneously, an animal’s choice between a path
which led to water or a path which led to food was not only a function of the relative time of 
deprivation, but also a function of which of the two substances (food or water) the animal had 
received last. Thus, with simultaneous feeding and watering, if the animals finished their food 
and then drank water, a majority of their choices on the following day would be to food. If they 
drank their water and then ate their food the opposite would be true. The procedure which 
produced the most even distribution of choices between food and water was one which allowed the 
animal to eat two-thirds of its daily biscuit (six gm.) and then permitted it to consume all the 
water it desired. When the animal was finished drinking, the water bottle was removed, and 
the animal was given the remaining third of the biscuit (three gm.). The amount of time required 
for the consumption of the food and water was approximately 30 min. and it occurred 21 hours 
prior to the experimental session. 

The training series consisted of four trials daily for seven days. The initial run of each day 
was a free choice, the second was a forced trial to the side opposite that chosen on trial one, the 
third was a free choice, while the fourth was a forced run to the side opposite that chosen on trial 
three. Thus, the animals had equal experience with the contents of both goal boxes. Forced 
trials were accomplished by lowering Door Three or Door Four. 
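
The daily schedule described above can be sketched in code. This is only a minimal illustration, not part of the original procedure; the function names and the side labels "left" and "right" are assumptions introduced here. It shows that, whatever the animal's two free choices, the forced trials bring the visits to each side to two per day.

```python
# Hypothetical helper names; the side labels are illustrative only.
OPPOSITE = {"left": "right", "right": "left"}


def daily_trials(first_free_choice, second_free_choice):
    """Return the four trials of one training day as (kind, side) pairs.

    Trials 1 and 3 are free choices; trials 2 and 4 are forced runs to the
    side opposite the one chosen on the preceding free trial.
    """
    return [
        ("free", first_free_choice),
        ("forced", OPPOSITE[first_free_choice]),
        ("free", second_free_choice),
        ("forced", OPPOSITE[second_free_choice]),
    ]


def visits_per_side(trials):
    """Count how many of the day's trials ended on each side."""
    return {side: sum(1 for _, s in trials if s == side) for side in OPPOSITE}


for first in ("left", "right"):
    for second in ("left", "right"):
        day = daily_trials(first, second)
        print(first, second, visits_per_side(day))
```

Every combination of free choices yields two visits to each side, which is the sense in which the animals had equal experience with the contents of both goal boxes.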

The animals were allowed to eat in the food box for 20 sec. and were permitted to take ap- 
proximately 10 swallows in the water box. Food was placed behind the water goal box and water 
was placed behind the food goal box in order to minimize any possible olfactory cues. 


Test Series


The test or critical trials consisted of one daily trial for four successive days, beginning on the 
day following the last day of the training series. The maze conditions were the same as those 
during the training period. 


On the first and fourth day of the test series the Ss were motivated for one goal object (food
or water), while on the second and third day they were motivated for the other goal object. This
abba order was utilized to avoid any possibility of building up an alternation pattern of responding
on successive test trials.
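
The abba order can be written out explicitly. The sketch below is illustrative only; the function name and the assignment of food to the left side are assumptions made here, not details of the experiment. It shows why a habit of simple alternation could not produce four correct test responses: the correct side repeats between the second and third days.

```python
def abba_order(first_goal):
    """Return the goal object the animal is motivated for on test days 1 to 4."""
    other = {"food": "water", "water": "food"}[first_goal]
    return [first_goal, other, other, first_goal]


# Assumed layout for illustration: food on the left, water on the right.
side_of = {"food": "left", "water": "right"}

order = abba_order("water")
correct_sides = [side_of[goal] for goal in order]

# True where the correct side changes from one test day to the next.
changes = [correct_sides[i] != correct_sides[i + 1] for i in range(3)]

print(order)
print(correct_sides)
print(changes)
```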


The animal’s choice was final once it passed Door Three or Four. If it went to the side for
which it was motivated, it was allowed to consume the goal object to the same degree as was
allowed during the training trials (10 swallows of water, 20 sec. of eating food). If the animal
made an error by going to the goal box for whose contents it was satiated, it was allowed to stay
in the goal box for 30 sec.


Since many animals had acquired preferences for one of the two alternative sides, the last 
four free choices on the sixth and seventh days of training were noted, and half of the animals 
were motivated, on the first test trial, against the side for which they had a preference, while for 
the other half the opposite was true. Thus, on any one test trial the preference factor was 
approximately equated. 


RESULTS 


The results are expressed in terms of correct responses during the 
test series of four trials; a correct response being choice of the side 
of the maze which contained the goal object for which the animal 
was motivated on that day’s trial. For the first test trial 13 animals 
were motivated for water, while seven were motivated for food. 
For the next two test trials this proportion was reversed, i.e., 13 were 
motivated for food and seven for water. The distribution was the 
same for the fourth test trial as it was for the first. 

The results are indicated in Table I. 

TABLE I

NUMBER AND PERCENTAGE OF CORRECT RESPONSES TO FOOD AND WATER
FOR EACH TRIAL OF THE TEST SERIES*

            Number and Percentage of Correct Responses

            To Water        To Food         Total
Trial       N      %        N      %        N      %

1          12     92        5     71       17     85
2           7    100        8     62       15     75
3           7    100       11     85       18     90
4          13    100        5     71       18     90
Total      39     98       29     73       68     85

* The totals reveal that the animals responded more accurately during the test trials when
they were motivated by water deprivation. This is probably due to the fact that a 21-hour
thirst drive possesses more drive strength than a 21-hour hunger drive. The work of Warner
(6) supports such a conclusion. However, the results of the present training trials indicate that
this relationship changes when both drives are present simultaneously.

Fig. 2 indicates the percent of choices to the side containing the
goal object appropriate for the motivation of Test 1. The results
of the first seven days represent an average percent of the two free
choices given daily. The data on the test days (T-1 and T-2)
represent the mean of the single trials for each of the 20 Ss.
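
The percentages in Table I can be recomputed from the counts of correct responses and the group sizes stated in the text (13 and 7 animals, reversed on the second and third trials). The following is a minimal check, not the author's own computation; the names are ours, and rounding halves upward is an assumption that happens to reproduce the tabled totals.

```python
import math

# (correct responses, animals motivated) for each test trial, from Table I
# and the group sizes given in the text.
COUNTS = {
    1: {"water": (12, 13), "food": (5, 7)},
    2: {"water": (7, 7), "food": (8, 13)},
    3: {"water": (7, 7), "food": (11, 13)},
    4: {"water": (13, 13), "food": (5, 7)},
}


def percent(correct, n):
    """Percentage rounded to the nearest whole number, halves rounded up (assumed)."""
    return math.floor(100 * correct / n + 0.5)


def trial_row(trial):
    """Return the water, food, and total percentages for one test trial."""
    water, food = COUNTS[trial]["water"], COUNTS[trial]["food"]
    total = (water[0] + food[0], water[1] + food[1])
    return percent(*water), percent(*food), percent(*total)


def totals_row():
    """Return the water, food, and total percentages over all four trials."""
    water = [sum(COUNTS[t]["water"][i] for t in COUNTS) for i in (0, 1)]
    food = [sum(COUNTS[t]["food"][i] for t in COUNTS) for i in (0, 1)]
    total = (water[0] + food[0], water[1] + food[1])
    return percent(*water), percent(*food), percent(*total)


for trial in COUNTS:
    print(trial, trial_row(trial))
print("Total", totals_row())
```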

Fig. 2. Percent of choices to the side containing the goal object appropriate for the
motivation of Test 1.

The standard error of the difference between the seventh day
(47.5 percent) and the first test trial (85 percent) is 11.23. Therefore,
this difference of 37.5 percent is 3.34 times its standard error. Results
for Test 2, in which the motivation was different from that of Test 1,
indicate the goal object for Test 1 was chosen only 25 percent of the
time, or in terms of the correct response, 75 percent, as is indicated
by the large dot. This change from 85 percent to 25 percent in
terms of the correct side for Test 1 is 4.78 times its standard error.
Therefore, we may reject the hypothesis that these changes in
preferences between successive test trials can be entirely due to
chance. The level of confidence of this assertion is well beyond
the one percent level.
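
The two ratios quoted above appear to follow from the usual standard error of a difference between two independent percentages. The sketch below is offered under stated assumptions rather than as the author's worked computation: the seventh-day figure is taken to rest on 40 free choices (two for each of the 20 Ss), each test trial on 20 choices, and the function names are ours.

```python
import math


def se_percent(p, n):
    """Standard error, in percentage points, of a percentage p based on n choices."""
    q = p / 100
    return 100 * math.sqrt(q * (1 - q) / n)


def se_difference(p1, n1, p2, n2):
    """Standard error of the difference between two independent percentages."""
    return math.sqrt(se_percent(p1, n1) ** 2 + se_percent(p2, n2) ** 2)


# Seventh training day (assumed 40 free choices) against Test 1 (20 choices).
se_a = se_difference(47.5, 40, 85, 20)
ratio_a = (85 - 47.5) / se_a

# Test 1 against Test 2, both in terms of the side correct on Test 1.
se_b = se_difference(85, 20, 25, 20)
ratio_b = (85 - 25) / se_b

print(round(se_a, 2), round(ratio_a, 2))
print(round(se_b, 2), round(ratio_b, 2))
```

Under these assumptions the first line gives 11.23 and 3.34 and the second ratio comes to 4.78, in agreement with the values reported in the text.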

An additional statistical question which might be asked is whether 
the animals reacted appropriately to their dominant drive during the 
test trials. This question can be answered by testing the hypothesis
that the true values during the test trials were 50 percent. The 
standard error of 50 percent is 11.18. The value 85 percent is 3.13 
standard error units from 50 percent while 25 percent is 2.24 units 
away. In both cases we may reject the hypothesis that the responses 
on Test 1 and Test 2 could be entirely attributed to chance.³
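
The comparison with chance can be reproduced in the same way. This is a brief sketch, assuming 20 choices on each test trial and a true value of 50 percent; the variable names are ours and the code is not taken from the original analysis.

```python
import math

N_ANIMALS = 20          # one test choice per animal
CHANCE = 50.0           # hypothesized true percentage

# Standard error of a percentage when the true value is 50 percent.
se_chance = 100 * math.sqrt(0.5 * 0.5 / N_ANIMALS)

# Distance of each obtained percentage from 50, in standard error units.
units_test1 = abs(85 - CHANCE) / se_chance
units_test2 = abs(25 - CHANCE) / se_chance

print(round(se_chance, 2), round(units_test1, 2), round(units_test2, 2))
```

The three printed values correspond to the 11.18, 3.13 and 2.24 given in the text.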


DISCUSSION


The results of this experiment are seen not to be in agreement 
with the expectation based on the assumption that both the hunger 
and thirst drive components of the stimulus pattern became condi- 
tioned to both spatial responses. The Ss trained under simultaneous 
hunger and thirst drives were able to make the appropriate response 
in a later series of test trials when under either hunger or thirst
motivation. A satisfactory account of this phenomenon requires 
some modification or supplementation of the principle of association 
or conditioning. 

One aspect of this principle that might be modified is the assumption
that has usually been made that all stimulus components impinging
upon the sensorium at the time of a rewarded response become
associated with that response. An alternative assumption
would seem to involve a selective principle which would designate the
fractional component of the stimulus situation which becomes connected
to a reaction leading to reinforcement. Thus, the present
experimental findings would be consistent with a principle of association
which stated that only those drive stimuli which are themselves
reduced become connected to a rewarded response.⁴ Such
reasoning would state that when the animal, both hungry and thirsty,
turned left to attain food in the present experimental situation, only
the hunger drive stimulus became connected to the response, because
the thirst drive stimulus failed to be reduced following the choice.

3 If we assume the 50 percent hypothesis to be true, we would find differences occurring by
chance, in the direction obtained, less than one percent in terms of 85 percent, and less than 1.25
percent in terms of 25 percent.

4 It should be emphasized that this proposed selective principle of association of drive stimuli
differs from a non-selective principle only in the specifications of what part of an existing stimulus
drive complex becomes connected to a rewarded response, but does not differ with respect to the
role of motivation and reward in learning.

Some confirmation of this analysis is contained in the results of an 
experiment recently published by the author (4). In a Skinner-box 
situation two groups of animals motivated simultaneously for food 
and water received only food reward. One group was motivated by 
12 hours of water deprivation and 22 hours of food deprivation (Group 
T-12), while the other group was motivated by 22 hours of water 
deprivation and also 22 hours of food deprivation (Group T-22). In 
the extinction series part of each group had their thirst drive changed 
to that of the other group, e.g., part of Group T-12 was extinguished 
under 22 hours of water deprivation, while the remainder was extinguished
under 12 hours of water deprivation. The former is
referred to as T-12(22) and the latter as T-12(12). The first number
refers to the thirst drive condition during acquisition of the bar
depressing response, while the number in parentheses refers to the
thirst drive during extinction. The hunger drive, 22 hours of food
deprivation, remained constant throughout all experimental sessions.
In a like manner, part of the T-22 group was motivated during extinction
by 12 hours of water deprivation [T-22(12)], while the
remainder possessed the same thirst drive during extinction as had
existed during the training trials [T-22(22)]. If we assumed the
non-rewarded thirst drive stimulus was associated to the bar depressing
response, we would expect the ‘changed’ groups [Groups T-12(22)
and T-22(12)] to extinguish more rapidly than the ‘constant’ groups
[Groups T-12(12) and T-22(22)], because the former would be
extinguished on a generalized drive state while the latter would be 
extinguished under the same drive condition that had prevailed 
during the training trials. Such was not found to be the case. A 
statistical analysis revealed no significant difference between the 
extinction rates of the combined ‘constant’ groups and the combined 
‘changed’ groups. Either the additional thirst drive did not become 
connected to the bar depressing response or the difference between 
the two thirst drives (12 and 22 hours of water deprivation) was not 
sufficient to produce a significant generalization effect. 

A different interpretation of the findings of this experiment, which 
makes use of certain mechanisms exploited extensively by Hull in 
his formulation of learning theory, has been offered by Spence (1) 
in an unpublished symposium paper presented at the 1941 meeting 
of the Midwestern Psychological Association. According to this 
interpretation, the stimuli in the food box and in the alley leading to 
the food box become associated with anticipatory eating responses 
(salivation, masticatory movements, etc.) and, in turn, the proprioceptive
stimulus components resulting from these anticipatory goal
responses become themselves conditioned to entering and continuing
locomotion in the alley leading to the food box. In a similar manner
anticipatory drinking responses become conditioned to the alley 
leading to water and provide cues which tend to elicit the response of 
approaching and continuing locomotion in that alley. 

During the test series when only a single drive is present, for 
example hunger, the anticipatory goal responses related to that drive 
tend to be elicited and these provide cues which are conditioned to 
evoking responses of approaching and entering the food alley. The 
reason that the anticipatory food responses are dominant over the 
anticipatory drinking responses during such a test is that in the 
previous history of the animal the hunger drive has acquired a strong 
tendency to elicit eating responses and only little, if any, tendency 
to evoke drinking responses. In a similar manner, the presence of a 
thirst drive on a test day would tend to elicit vigorous anticipatory 
drinking responses which, in turn, would provide cues favoring the 
making of responses in the direction of the water alley. 

Finally, attention should be called to the fact that Tolman’s 
interpretation of learning would have no difficulty in explaining the 
present experimental results. According to this theory the stimulus 
cues in the alley leading to the food become signs indicating the 
subsequent significate, food, and the sign-significate gestalts
thus established by sequence in experience (principle of association by
contiguity) and the momentary dominant motivation state determine
which response will be made in the subsequent test situation. In
some sense the interpretation offered by Tolman is similar to that of
Spence above, except that it is more molar in nature. Spence’s
interpretation differs, of course, in that it is basically a reinforcement
theory.


SUMMARY 


The purpose of this experiment was to evaluate Hull’s explanation 
of the Hull-Leeper phenomenon—the fact that organisms have been 
discovered to be capable of responding appropriately to a single- 
choice situation on the basis of momentary dominant drive states. 
In both experiments only one drive (either hunger or thirst) was 
present during any one training trial. Hull utilized the principle of 
patterning to explain this phenomenon, i.e., the animal learned to
respond one way to the pattern of choice-point and thirst-drive 
stimuli and another way to the pattern of choice-point and hunger- 
drive stimuli. 

The present experiment was designed to discover whether similar 
discrimination behavior could be learned if both hunger and thirst 
drives were present simultaneously during the training series. Strict
adherence to Hull’s analysis with no other factors considered would 
lead one to expect failure of the animals to learn such discrimination, 
because of the existence of only one stimulus pattern during the training
trials. The results of the experiment were not in agreement with
this expectation. The animals, trained under simultaneous hunger
and thirst drives, were able to respond appropriately during the test 
trials when motivated by either hunger or thirst. 

Two possible explanations of the results were presented. One 
was based upon a selective principle of association which stated that 
only those drive stimuli which are themselves reduced become connected
to the rewarded response. The other explanation utilized
the mechanism of the anticipatory goal response without making any 
revision of the law of association. 


(Manuscript received October 11, 1945)

STUDIES IN SPATIAL LEARNING. II. PLACE LEARNING
VERSUS RESPONSE LEARNING


BY E. C. TOLMAN, B. F. RITCHIE, AND D. KALISH¹


A. INTRODUCTION 


Consider the case in which normal rats (i.e., those deprived of no 
sense capacities) have been trained to find food on a simple T-maze. 
After several days of such training we observe that whenever the 
rats are put at the starting place, they run quickly to the choice 
point and without hesitation turn down the path which leads to the 
food box. We then say that the rats have learned. But what is it 
that they have learned? There are at least three different answers 
which have been given to this question. 

1. Such training may have produced a disposition in the rats to 
run on a path which has certain specific characteristics (e.g., knot- 
holes of such and such a pattern, or the like) and to avoid running on 
all paths which have certain other specific characteristics. 2. Such 
training may have produced a disposition to turn right whenever they 
come to the choice point. 3. Finally, such training may have 
produced a disposition to orient towards the place where the food is 
located (e.g., under the window, to the left of the radiator, etc.). 

Each of these answers has at one time or another been defended 
by some psychologist as the only way in which rats learn mazes. 
Today, however, the first hypothesis has few supporters. The 
experiments of Honzik (1, pp. 17-19) and others on the sensory con- 
trol of maze learning have demonstrated clearly that learning in terms 
of intra-maze cues alone (i.e., when extra-maze stimuli are changed 
from trial to trial) is extremely difficult for the rat. Thus we must 
conclude that the rapid learning exhibited by rats in most maze 
problems is probably based on other cues than intra-maze ones. 

There is, however, no direct evidence which enables us to choose
between the last two hypotheses. So far no experiment has been
performed which has separated these two dispositions. In all T-mazes,
as they have usually been constructed, running to a given
place in the environment is always accomplished by a certain response
(e.g., a right turn at the choice point) or set of responses. From such
behavior it is obviously impossible to determine whether the training
has produced a disposition to turn right or a disposition to go to a
certain place.

1 The cost of this investigation was met in part by grants to the Department of Psychology
from the Research Board of the University of California.

In the first paper of this series (2) we presented evidence that 
training on a particular path, which requires specific right and left 
responses, produces a disposition in the rat to take the shortest route 
to the food place when the original path is blocked. It is clear from 
this fact that the disposition which is acquired involves more than 
the tendency merely to make the original trained response. This 
fact by itself strongly suggests that what is learned in T-mazes 
where choices must be made is not a disposition to make certain 
responses (e.g., right turns) but rather a disposition to orient towards, 
or go towards the location of the goal. 

The purpose of the present paper is to present direct evidence for 
this latter hypothesis. In order to do this it is necessary, as we have 
pointed out, to construct a situation in which we can separate and 
distinguish the disposition to turn right from the disposition to go to 
the location of the goal. With this end in mind we have constructed 
a simple maze problem which can be arranged in two different ways. 
The only difference between these two is that in one arrangement the 
maze can be learned only if the rats acquire a disposition to run right,
whereas in the other arrangement the maze can be learned only if
the rats acquire a disposition to go to a constant food location. One
group of rats will be run on each of these two arrangements of the
maze and their learning will be compared. The group run in the first
arrangement will be called ‘The Response-Learning Group’; the
other will be called ‘The Place-Learning Group.’

Let us consider three possible results to such an experiment. 1.
The Response-Learning Group may learn but not the Place-Learning
Group. Such a result would indicate that rats do not acquire dispositions
to orient towards the location of the goal. 2. The Place-Learning
Group may learn but not the Response-Learning Group.
This would indicate that rats do not acquire dispositions to make
specific responses (e.g., right turns). And 3. Both groups may learn,
but one of them may learn more rapidly than the other. Such a
result would indicate that rats can acquire both dispositions, but that 
one of them is more native or primitive than the other (in the sense 
that brightness discrimination is more native for the rat than pattern 
discrimination). Such a result would also indicate that probably 
most of the rapid spatial learning exhibited by normal rats in other 
analogous situations consists in the acquisition of the more primitive 
kind of disposition. 


B. ANIMALS 


Two groups of eight male M x M pigmented rats from the Tryon ‘bright’ and ‘dull’ stocks
were used in this experiment. They were all approximately 90 days old, and had been on a
24-hour wet food maintenance schedule for three days before the beginning of the training. During
the experiment they were run and fed every evening at 9:00 P.M.


C. APPARATUS 


The apparatus was an elevated maze (see Fig. 1) made of two-in. wide gray painted pine
paths. The center path running from the food boxes at F1 to those at F2 was eight feet long, while
the paths S1C and S2C were each two feet long. On the center path, F1F2, four in. from each
end were wire frames from which were suspended black curtains eight in. square. The food
boxes contained four compartments which were six in. high, four in. wide, and 10 in. deep. In
each compartment was a glass bird bath, and on the rim of each of these bird baths was a
fraction of a teaspoon of wet food.

Fig. 1. Elevated maze. S1 and S2: starting points; F1 and F2: food boxes; C: center point.

This maze was set in a room (see Fig. 2) which was 30 by 45 feet. The ceiling of this room 
was 25 feet from the floor. The only illumination in this experiment came from the three over- 
head lamps shown in Fig. 2. The home cages were kept on a rack 15 feet behind the point Fy. 


D. METHOD


Preliminary training.—The Place-Learning and the Response-Learning Groups were given 
the same preliminary training. During this period the eight-foot F1F2 path was located in
another part of the room (see Fig. 2) 20 feet away from the location where it was later used. 
Further, this path was placed at right angles to its later position. On the first day each rat 
was put by hand into one of the food boxes and allowed to feed for five min. Next it was put in 
the opposite food box and allowed to feed for another five min. Next each rat was placed in the 
middle of the F1F2 path and allowed to run to one of the two food boxes. After this the rats
were put back in the home cages and half an hour later fed their daily ration. 

On the second day the rats were again started from the middle of the F1F2 path and allowed
to run to one of the two food boxes. Next they were started in the same fashion but when they
began to run towards, say F2, a block was placed in front of the food box. A similar block was
placed at F1 if they started in that direction. Thus, no matter which direction they chose they
met a block and were forced to turn around and run towards the food box at the opposite end of 
the path. This procedure was repeated three more times before the rats were returned to their 
home cages. 

The Response-Learning Group, (N = 8).—On the third day the experiment proper began. 
The maze illustrated in Fig. 1 was used. Each of the rats in the Response-Learning Group was 
given six daily trials on this maze. On half of these trials the rats started from S1 and on the other
half from S2. The order of these starting positions was S1S2S2S1S1S2 and on alternate days
S2S1S1S2S2S1.

Whenever an animal in this group was started at S1, the food box at F2 was blocked, and the
food box at F1 was open. Conversely, when started at S2 the food box at F1 was blocked and
that at F2 was open. Thus the animals in this group were required to learn always to turn right
at the choice point C, whether started from S1 or from S2.

Fig. 2. Maze in its environment


An error was recorded whenever a rat turned and ran more than 12 in. toward the incorrect 
food box. The time elapsing between the instant at which the rat was put on the maze and the 
instant at which it made a choice was also recorded. 

The Place-Learning Group (N = 8).—On the third day the rats of this group were given one 
test trial to determine their position habits. The procedure of this test trial was as follows: 
Each rat was started at S1. If a rat turned right towards F1, then a block was placed behind the
curtain at F1 so that the rat was forced to turn and go into F2 at the opposite end of the path.
Conversely, if the rat turned left, a block was placed at F2 and the rat was forced to go to F1.
After this initial test trial the apparatus was arranged for the experimental trials so that each
rat in this group would have to go against its original position habit. Thus, within the
Place-Learning Group we had 5 rats with initial biases for going to F1 and 3 with initial biases for
going to F2. The former were required to go to F2 on all experimental trials, while the latter
were required to go to F1 on all experimental trials.

Each of these rats was given six daily trials, and the order of the starting positions was the
same as that used for the Response-Learning Group. Thus the animals in the Place-Learning
Group were required to learn to go always to the same place in the room whether started from S1
or from S2. Errors and times were recorded.


Both groups were run to a criterion of 10 successive errorless trials. 
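
The difference between the two arrangements can be made concrete with a short sketch. It is an illustration only, not the experimenters' procedure in code: it assumes the reading of the text on which a right turn at C leads to F1 when the rat starts from S1 and to F2 when it starts from S2, and the function names and the six-trial order of starting points are introduced here for the example.

```python
# Food box reached by a right turn at the choice point C, by starting point
# (assumed from the description of the maze).
RIGHT_TURN_BOX = {"S1": "F1", "S2": "F2"}


def open_box(group, start, place_goal=None):
    """Return the food box left open on a trial.

    group: "response" (always turn right) or "place" (always the same box).
    place_goal: the fixed box, "F1" or "F2", for a Place-Learning rat.
    """
    if group == "response":
        return RIGHT_TURN_BOX[start]
    if group == "place":
        if place_goal not in ("F1", "F2"):
            raise ValueError("a Place-Learning rat needs place_goal 'F1' or 'F2'")
        return place_goal
    raise ValueError("group must be 'response' or 'place'")


def required_turn(start, box):
    """Return the turn at C that leads from the starting point to the box."""
    return "right" if RIGHT_TURN_BOX[start] == box else "left"


# One balanced order of starting points for a day of six trials (illustrative).
starts = ["S1", "S2", "S2", "S1", "S1", "S2"]

response_boxes = [open_box("response", s) for s in starts]
response_turns = [required_turn(s, b) for s, b in zip(starts, response_boxes)]

place_boxes = [open_box("place", s, place_goal="F2") for s in starts]
place_turns = [required_turn(s, b) for s, b in zip(starts, place_boxes)]

print("response:", response_boxes, response_turns)
print("place:   ", place_boxes, place_turns)
```

For the Response-Learning arrangement the turn is constant while the rewarded box changes with the starting point; for the Place-Learning arrangement the box is constant while the required turn changes.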


E. RESULTS


The rats in the Response-Learning Group were run for 12 days
or 72 trials. During this time only three of the eight rats in this
group reached the criterion of 10 successive errorless trials (see Table
I). The rest of the rats in the Response-Learning Group developed
habits of always going to the same place (either F1 or F2), and thus
their performance remained at chance. All of the rats in the
Place-Learning Group reached the criterion within eight trials or less (see
Table II) and the mean number of trials to reach the criterion for
this group was 3.5 trials.


TABLE I
TRIALS TO REACH CRITERION: RESPONSE-LEARNING GROUP
Rat’s No.   32    33    35    36    72    73    78    79
Trials      22    72*   1S    15    72*   72*   72*   72*

* These rats did not reach the criterion in 72 trials.


TABLE II
TRIALS TO REACH CRITERION: PLACE-LEARNING GROUP
Rat’s No.   21    22    24    25    26    28    29    30
Trials       2     2     1     6     7     8     2     2

The differences between the two groups can also be seen in their 
error curves (see Fig. 3). The error curve for the Response-Learning 
Group comes down very slowly and finally after the 22nd trial reaches
a plateau at a value between four and five errors. The error curve
for the Place-Learning Group, on the other hand, comes down rapidly
and reaches a plateau at zero errors on the 10th trial.

Despite the great difference in the error curves for the two groups, 
there was little if any difference between the two groups in the amount 
of hesitation they exhibited before making a choice (see Fig. 4). By 
the 10th trial the rats of both groups were taking about four sec. to
make their choices. 

On the 13th trial for the Place-Learning Group, the center path,
F1F2, was rotated end for end, so that the path which had previously
led to the open food box now led to the blocked food box and vice
versa. Inspection of the error and time curves (see Figs. 3 and 4)
indicates that this change did not affect their behavior.


Fig. 3. Errors as a function of trials


Fig. 4. Choice-times as a function of trials


F. DISCUSSION





R. S. Woodworth (3) after summarizing the results of the early 
experiments on the sensory control of maze learning concludes: 


Since neither chain reflex nor motor pattern accounts for the rat’s behavior in the maze, 
we ask once more what it is that the animal learns. The most obvious answer, which has 
been given repeatedly by investigators in describing the rat’s concrete behavior, though 
avoided in their theories, is simply that the rat learns the place. By place we mean a concrete
situation containing objects in spatial relations. By learning the place we do not imply
that the animal acquires a memory image which he can call up in the absence of the place; 
we need not credit him with any power of ideational recall. We do credit him with the 
simpler power of recognizing a presented object or situation. We credit him with some 
power of perception or observation, so that he can discover the character of different objects 
and different parts of the maze. He observes the food-containing character of the food 
box, the dead-end character of the blind alley, the particular odor of a bit of floor—and the 
location of these parts in relation to each other. The maze, at first a vague total, comes to 
have parts in definite location and with definite characters (3, p. 135). 


In this statement, Woodworth seems to make the same distinction 
which we made in the introduction to this article. In our terms, 
what Woodworth is saying is that training in a maze produces a 
disposition to go to certain places rather than a disposition to make a 
certain set of responses. Woodworth’s criticism of former experi- 
menters for avoiding this ‘obvious’ conclusion in their theories is not, 
however, quite fair. At the time that he was writing there was no 
unambiguous definition for the matrix “x learns the place of the
food” or “x expects food at location L.” This fact may have been, 
and certainly should have been, the reason why these experimenters 
did not employ this explanation in their theories. 

In the first article of our present series (2) we presented a conditioned
definition for the matrix “x expects food at location L,”
which makes it equivalent to the matrix “x runs down the path which
points directly towards location L” when certain conditions are fulfilled.
Furthermore, we presented evidence in the same article that
training over a specific route produces such a disposition in many
rats.² Thus, if one accepts this definition, one can make an interpretation
of Woodworth’s place-learning hypothesis which can be
experimentally tested. The experiment reported in the first article
was such a test, and its results indicated that under some conditions³
what is learned is a disposition to orient towards the location of the goal;
in short, to expect food at that location.

Although the evidence presented in the first article indicated that 
such orientational dispositions were sometimes acquired by rats, it did 
not indicate that the rat acquired such dispositions when trained upon a 
T-maze where choices were required. Two alternative ways of test- 
ing this latter hypothesis suggested themselves. 

1. We could train a group of rats on a simple T-maze, and then 
block them before they reached the choice point and discover which 
of a number of alternative paths they would choose. This would be, 
of course, a repetition of the experiment reported in the first paper of 
this series and would differ from it only in the fact that the original 


2 It was assumed that with more training almost all the rats would have acquired this dis- 
position. Evidence for this assumption will be presented in a later paper in this series. 

3 Just what conditions are necessary and sufficient for the rats to acquire such dispositions is 
a problem to be investigated. 
training would involve a choice rather than a single path. Such an 
experiment would enable us to decide whether such dispositions were 
acquired in T-mazes, but it would not enable us to determine (a) 
whether there were other kinds of dispositions which might also be 
acquired in such situations, or (b) should there be such another kind 
of disposition, which of the two kinds was the simpler and more primi- 
tive for the rat. Because of these considerations we decided upon 
another way of testing this hypothesis. 

2. We constructed a simple maze which could be arranged in two 
ways. One arrangement could be learned only if the rats acquired a 
disposition to turn right. The other could be learned only if the rats 
acquired a disposition to go to a constant food location. By compar- 
ing the behavior of the rats in these two situations we could answer 
all the questions raised above. 

The results of this experiment indicate that both kinds of disposi- 
tions may be acquired by the rat, but that the disposition to orient towards 
the goal is simpler and more primitive than the disposition to make right 
turns. This is confirmed not only by the fact that all of the Place- 
Learning Group learned more rapidly than any of the Response- 
Learning Group, but also by the fact that five of the latter group 
developed habits of consistently going to the same place, despite the 
fact that food was there only 50 percent of the time.4 

Finally, one criticism is likely to be raised against our interpreta- 
tion of these results. Someone may argue that the Place-Learning 
Group learned more quickly than the Response-Learning Group 
because they merely had to learn to take the path which led from C 
to the food box no matter where they were started, while the Re- 
sponse-Learning Group had to learn to take different paths from 
different starting positions. Such a criticism assumes that spatial 
learning consists in the acquisition of dispositions to run on a path 
which has certain specific characteristics (e.g., knotholes of such and 
such a pattern, and the like). We said in the introduction to the 
present article that there is a great deal of evidence against this 
hypothesis and that as a consequence it is no longer widely held. 
However, in order to demonstrate that our rats did not acquire a 
disposition to take a particular path, we rotated the center path, 
F1F2, on the 13th trial for the Place-Learning Group. After this 
rotation the path which had led to the food box now led to the blind 
and vice versa. This rotation produced no change in the rats’ behavior. 
They continued without hesitation to go to the end of the 
room which was correct. This fact indicates that their performance 
was completely independent of intra-maze cues, that is, that they had 
not acquired a disposition to take a particular path. 


4 This is an interesting fact in itself, since most experimenters, having observed ‘position 
habits’ in the usual T-maze, have interpreted them as habits of turning in the same direction 
(e.g., either always right or always left). The position habits exhibited by our rats, however, 
were ‘fixations’ on a particular end of the room. Thus, our rats who failed to solve the problem 
‘fixated’ in a way that involved 50 percent right and 50 percent left turns. 


G. SUMMARY 


1. The results of a previously reported experiment (2) suggested 
the hypothesis that what is learned in T-mazes, where choices must 
be made, is not a disposition to make certain responses (e.g., right 
turns) but rather is a disposition to orient towards the location of 
the goal. 

2. The experiment reported in the present paper tests this 
hypothesis. Two groups of rats were trained on a single unit maze, 
in which the starting path led into the choice point sometimes from 
the east and sometimes from the west. The Response-Learning 
Group (N = 8) was required to learn to turn always right. The 
Place-Learning Group (N = 8) was required to learn to go always 
to the same place, half the time turning left and half the time turning 
right. 

3. Only three rats in the Response-Learning Group reached the 
criterion (10 successive errorless runs) while the rest developed 
consistent habits of going always to the same place. All of the rats 
in the Place-Learning Group reached the criterion within eight trials 
or less. 

4. We conclude that in situations where there are marked extra- 
maze cues, place-learning is simpler than response-learning. 


(Manuscript received September 11, 1945) 
SUCCESS AND FAILURE IN SERIAL LEARNING. 
I. THE THORNDIKE EFFECT1 


BY GEORGE A. ZIRKLE 
Lt. Comdr., USNR 


Hanover College, Hanover, Indiana 


In 1933, E. L. Thorndike (4) published a report of experiments on 
the influence of success and failure in serial learning. His general 
experimental procedure was to present a series of stimulus materials 
to human Ss several times in succession. Some of their responses to 
these stimuli were rewarded and others were punished. It was found 
that rewarded responses were repeated more often than punished 
responses on re-presentation of their respective stimuli. Moreover, 
a gradient of repetition of punished responses about a rewarded re- 
sponse was noted, the number of repetitions declining with increasing 
distance from the rewarded response. This was true both before and 
after a rewarded response. Proceeding from these results, Thorndike 
concluded that “the strengthening influence of a reward spreads to 
influence positively not only the connection which it directly follows 
and to which it may be said to belong, but also any connections 
which are near enough to it” (4). The gradient phenomenon noted 
is regarded by Thorndike as an independent proof of the law of effect. 

Following up Thorndike’s technique, Lepley (1), Muenzinger and 
Dove (2), and Muenzinger, Dove, and Bernstone (3) obtained similar 
results. Muenzinger and Dove term the gradient of reproduction of 
unsuccessful responses about a successful response the ‘Thorndike 
effect.’ This convenient designation is adopted in this report. Wal- 
lach and Henle (6, 7) obtained results which failed to support those of 
Thorndike, however. These latter experimenters provided a situ- 
ation in which their Ss desired neither to remember nor to forget 
responses. Thus, the effect of rewarding any given response was not 
complicated by the additional effort of the S to remember or forget 
it. They concluded that reward as such does not favor recall. 

The experiments reported herein, and in a second article to follow, 
were designed to throw further light upon the nature of the Thorndike 
effect and the conditions surrounding its appearance. 


1 This article and one soon to follow are based on a dissertation presented in partial fulfill- 
ment of the requirements for the Ph.D. degree, Duke University, 1941. The investigation was 
directed by Dr. D. K. Adams and derived useful suggestions from unpublished work in the Duke 
Laboratory by Esther Bond Foster. 
EXPERIMENT I 


In this experiment, a single series of 10 groups of stimulus ma- 
terials was provided. Each group of materials was internally similar, 
but unlike any other group. The object of the experiment was to 
determine whether the similarity of series items whose responses are 
punished to a series item whose response is rewarded will influence 
the reproduction of the punished responses. Now similar series 
materials would seem to ‘belong’ to each other more than to dissimilar 
materials. This being the case, it might be hypothesized with some 
reason that the ‘strengthening influence of a reward’ would spread 
to influence positively those stimulus-response connections which 
‘belonged’ to the rewarded connection more than those connections 
not belonging to it so clearly. Were this true, we would expect to 
find the Thorndike effect more pronounced on the similar side of a 
rewarded connection than on the dissimilar side. This hypothesis 
may conveniently be referred to as the ‘similarity hypothesis.’ 


Procedure 


The series materials were presented to the Ss by means of a rotary tachistoscope. There 
were 60 different objects in all. These were, in order, 10 nonsense syllables printed in large 
black capitals, 10 individually colored pieces of paper, 10 12-letter words printed in small, black 
capitals, 10 geometric figures drawn in black, 10 four-letter words printed in red capitals, and 10 
capital letters printed in black with a fancy script. The list was presented 10 times in succession, 
without interruption. A new object was shown every two sec. The time was measured by 
means of a metronome which was hooked up in an electric circuit with the tachistoscope. 

Twenty college students acted as Ss in the experiment. Each S was taken for two experi- 
mental sessions, with an interval of about a week between sessions. His responses to the stimulus 
materials were entered on a score sheet by the E and were immediately rewarded by being called 
‘right’ or punished by being called ‘wrong.’ Responses were called right or wrong according 
to a prearranged pattern, regardless of the numbers given. Such an arbitrary method of calling 
responses ‘right’ or ‘wrong’ has also been used by Thorndike (4), Wallach and Henle (5), and 
Muenzinger and Dove (2). Several Ss in this experiment and in other like experiments were 
asked if they thought the scoring method was arbitrary. A few said they were suspicious, but 
their suspicions appeared to come after rather than during the experiment. With so many items 
coming so fast, there is little or no time for logical deduction. 

During the first session, half of the Ss were given the list in which the response to the first 
object of a similar group was called right; the other half were given the list in which the response 
to the last object of a similar group was called right. On the second occasion this was reversed. 

The S was seated in a chair about four feet away from the E, with his back turned toward 
the E. The following instructions were read to him twice: 


“I shall show you in the space on the apparatus before you a series of items to each of which 
is assigned a number from 1 to 10. You are to respond to each of these items immediately 
by calling out the first number from 1 to 10 inclusive that occurs to you. If you call out the 
right number I shall say ‘right,’ and if you call out any other number I shall say ‘wrong.’ ”2 


In this experiment, as in all others, the E tried to use the same inflection in his voice for pro- 
nouncing both ‘right’ and ‘wrong.’ If an S called the same number over and over again, waiting 
to get it called right, he was instructed to mix his number responses. This happened in only a 
few cases. 





2 These instructions are practically identical with a set given in an experiment by Muenzinger 
and Dove (2). 
















The method of calculating responses was as follows. Each number association with an item 
which was called right was designated as a ‘right response’ and each number association with an 
item which was called wrong was called a ‘wrong response.’ When, and only when, a response 
occurring on the nth presentation was repeated on the (n + 1)th presentation, it was termed a 
repetition. Stated otherwise, the only responses calculated as repetitions were those which 
duplicated the responses given in the immediately preceding presentation of the list. Wrong 
responses preceding a right response were designated as ‘1 step,’ ‘2 steps,’ ‘3 steps,’ or ‘4 steps’ 
preceding a right response, depending upon the position relative to the right response. Wrong 
responses following a right response were similarly designated. Each response was counted only 
once; that is, the same response was never calculated as following one right response and also 
as preceding another. 
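
The counting rule just described can be illustrated with a minimal sketch. The function name `count_repetitions` and the three-presentation protocol are hypothetical and serve only as illustration; they are not data from the experiment.

```python
from typing import List


def count_repetitions(presentations: List[List[int]]) -> List[int]:
    """Count, per list position, responses that duplicate the response
    given at the same position on the immediately preceding presentation."""
    counts = [0] * len(presentations[0])
    # Pair presentation n with presentation n + 1.
    for previous, current in zip(presentations, presentations[1:]):
        for position, (old, new) in enumerate(zip(previous, current)):
            if old == new:
                counts[position] += 1
    return counts


# Hypothetical protocol: three presentations of a five-item list.
protocol = [
    [3, 7, 1, 9, 4],
    [3, 2, 1, 5, 4],
    [3, 7, 1, 5, 6],
]
repetitions = count_repetitions(protocol)
possible = len(protocol) - 1  # opportunities for repetition per position
percent = [100 * r / possible for r in repetitions]
print(repetitions)
print(percent)
```

In this made-up protocol the response 7 at the second position recurs on the third presentation, but it is not counted, since it does not duplicate the response of the immediately preceding presentation.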


RESULTS 


Table I shows the declining gradient of the Thorndike effect, but 
Table II shows it only on wrong responses following a right response. 















































TABLE I 
RIGHT RESPONSE FIRST OF LIKE GROUP 

                         Wrong Responses Preceding    Right    Wrong Responses Following 
                                 by Steps:            Resp.            by Steps: 
                           4      3      2      1               1      2      3      4 
Possible repetitions     1195   1189   1190   1189    1200    1196   1192   1189   1190 
Actual repetitions        169    164    167    191     510     252    178    162    157 
Percent repeated         14.1   13.8   14.0   16.1    42.5    21.1   14.9   13.6   13.2 
Percent repeated on 
  all four steps                  14.5                                 15.7 

TABLE II 
RIGHT RESPONSE LAST OF LIKE GROUP 

                         Wrong Responses Preceding    Right    Wrong Responses Following 
                                 by Steps:            Resp.            by Steps: 
                           4      3      2      1               1      2      3      4 
Possible repetitions     [illegible] [illegible] 1190 1189 1194  1184   1179   1189   1191 
Actual repetitions        139    162    141    138     312     231    194    149    155 
Percent repeated         11.6   13.6   11.8   11.6    26.1    19.5   16.5   12.5   13.0 
Percent repeated on 
  all four steps                  12.2                                 15.4 











On the basis of the similarity hypothesis, wrong stimulus-response 
connections similar to a right stimulus-response connection would be 
repeated more than dissimilar connections. Table I shows repetitions 
where similar connections follow a right connection. In it we would 
expect to find greater percentages of repetition of wrong responses 
following a right response than on those preceding. Such is indeed 
the case, with a difference of 1.2 favoring the similar side. On the 
face of it, this result should tend to support the hypothesis of the 
influence of similarity. However, this same difference in favor of 
repetitions after a right response obtains in a large number of the 
experiments reported by Thorndike, by Muenzinger and Dove, and 
also in the writer’s experiments. Those experiments were not 
adapted to testing a similarity hypothesis. In some of the experi- 
ments there was an appreciably greater difference than the 1.2 shown 
in Table I. 

Now, what of the results where the right response was made to 
the last word of a similar group? Here, on the basis of the similarity 
hypothesis, we expected to find a higher rate of repetition preceding 
a right response than following it. Table II shows results which are 
exactly contradictory to such an expectation. There is a 3.2 differ- 
ence favoring repetitions on wrong responses following a right re- 
sponse. This 3.2 difference, which was not expected on the basis of 
our initial hypothesis, is almost three times as great as the 1.2 differ- 
ence when the right response was to a word which was the first of a 
like group, which difference was expected on the basis of the similarity 
hypothesis. 
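
The 1.2 and 3.2 differences can be checked with a short sketch that pools the step counts on each side of a right response. The counts are read from Tables I and II as printed; the helper name `pooled_percent` is hypothetical. Two of the possible-repetition entries on the preceding side of Table II are illegible in the available copy, so the printed 12.2 is taken as given for that side rather than recomputed.

```python
from typing import Sequence


def pooled_percent(actual: Sequence[int], possible: Sequence[int]) -> float:
    """Percent repeated over all four steps on one side of a right response."""
    return 100.0 * sum(actual) / sum(possible)


# Table I: right response FIRST of like group (steps 4, 3, 2, 1 and 1, 2, 3, 4).
t1_before = pooled_percent([169, 164, 167, 191], [1195, 1189, 1190, 1189])
t1_after = pooled_percent([252, 178, 162, 157], [1196, 1192, 1189, 1190])

# Table II: right response LAST of like group (following side only).
t2_after = pooled_percent([231, 194, 149, 155], [1184, 1179, 1189, 1191])
t2_before_printed = 12.2  # taken from the table as printed, not recomputed

diff_table1 = round(t1_after, 1) - round(t1_before, 1)
diff_table2 = round(t2_after, 1) - t2_before_printed
print(round(t1_before, 1), round(t1_after, 1), round(diff_table1, 1))
print(round(t2_after, 1), round(diff_table2, 1))
```

Pooling counts rather than averaging the four step percentages weights each step by its number of opportunities; with nearly equal denominators the two methods differ very little.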


It does not appear from this experiment that the influence of a 
verbal reward ‘strengthens’ punished connections which ‘belong’ to 
the rewarded connection through virtue of similar stimulus items any 
more than it ‘strengthens’ punished connections which do not so 
‘belong.’ If anything, a dissimilarity hypothesis would seem to fit 
the results more than a similarity hypothesis. 


EXPERIMENT II 


This experiment was designed to determine the result for the 
Thorndike effect of shifting the relative order of ‘wrong’ series items 
about a ‘right’ item.3 Thorndike has assumed that separate ‘con- 
nections’ or ‘bonds’ are established between stimulus items and the 
responses to them. Moreover, he has asserted that punished con- 
nections are established the more strongly the closer they are to a 
right connection. If separate connections are formed, we might 
anticipate that responses to stimulus items would be unaffected by 
changing the relative arrangement of items in a series from one 
presentation to the next. A punished connection which has been 
strengthened because it was one step from a rewarded connection 
should continue to be so strengthened even though its stimulus be 
changed to the fourth step on the next presentation of the list. If 
results should not bear out this expectation, it would then be in order 
to question Thorndike’s theory of unitary stimulus-response connec- 
tions which are strengthened in terms of their relative proximity to a 
rewarded response. 


3 This usage is for economy of expression. Items whose responses were called wrong are 
termed ‘wrong’ items. Similarly, items whose responses were called right are termed ‘right’ 
items. 


Procedure 


The series materials were 60 eight-letter words, chosen at random. These were arranged 
in two different orders. A prearranged pattern for calling responses ‘right’ or ‘wrong’ was fol- 
lowed. This arrangement allowed for four wrong responses to occur on either side of a right 
response. Thus the Thorndike effect might easily be measured. The six words whose responses 
were called ‘right’ kept the same position in both orders of the series. Wrong words on the first 
and fourth steps preceding a right word were interchanged from the first to the second order. 
Wrong words on the second and third steps preceding a right word were interchanged from the 
first to the second order. The same was done with the four wrong words following a right word. 
Thus a complete reversal of position of wrong words was obtained from one order to the next. 
This is shown by the following samples from the two orders of the list. 


Order 1 Order 2 
printers (wrong response) presumes (wrong response) 
supports (wrong response) youthful (wrong response) 
youthful (wrong response) supports (wrong response) 
presumes (wrong response) printers (wrong response) 
sideward (right response) sideward (right response) 
trimming (wrong response) changing (wrong response) 
portrait (wrong response) sometime (wrong response) 
sometime (wrong response) portrait (wrong response) 
changing (wrong response) trimming (wrong response) 
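
Interchanging steps 1 and 4 and steps 2 and 3 amounts to reversing each block of four wrong words on either side of a right word. A minimal sketch, using the sample segment above, follows; the function name `second_order` and the argument `right_positions` are hypothetical, and the sketch assumes that every right word has four wrong words on each side and that these blocks do not overlap.

```python
from typing import List, Sequence


def second_order(order1: Sequence[str], right_positions: Sequence[int],
                 span: int = 4) -> List[str]:
    """Build Order 2 by reversing the `span` wrong words on each side of
    every right word; the right words keep their positions."""
    order2 = list(order1)
    for i in right_positions:
        order2[i - span:i] = order1[i - span:i][::-1]
        order2[i + 1:i + 1 + span] = order1[i + 1:i + 1 + span][::-1]
    return order2


sample_order1 = [
    "printers", "supports", "youthful", "presumes",
    "sideward",  # right response, index 4
    "trimming", "portrait", "sometime", "changing",
]
sample_order2 = second_order(sample_order1, right_positions=[4])
print(sample_order2)
```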


The list was presented verbally by E a total of seven times, with the two orders of the list 
coming alternately. There was no interruption between presentations. A new word was given 
every three sec. Timing was by means of the click of a metronome. 

Twenty-five college students took part in the experiment. Instructions to the Ss were as 
follows. 


“I shall read to you a series of words to each of which is assigned a number from 1 to 10. 
You are to respond to each of these words immediately by calling out the first number from 
1 to 10 inclusive which occurs to you. If you call out the right number I shall say ‘right,’ 
and if you call out any other number I shall say ‘wrong.’ ” 


It was noted that some of the first Ss seemed to try very little to repeat the right responses 
and avoid the wrong ones. So the following sentences were added to the above instructions for 
the last 13 Ss: “I shall go through the list a number of times. Try to get as many right as 
you can.” 

The repetitions were calculated as in Experiment I, taking account of the shifts in positions 
of course. A number response to a word was counted as a repetition when it duplicated the 
number given for the same word in the immediately preceding presentation of the list. For in- 
stance, if the number ‘5’ was given for the fourth step word ‘printers’ in the first reading of the 
list and also given for ‘printers’ on the second reading, when ‘printers’ would be in the first step 
position, the duplication was counted as a repetition of a response to a word four steps removed 
from a right response. If the number ‘5’ was again given for ‘printers’ on the third reading of the 
list, when ‘printers’ would again be in the fourth step position, it was counted as a repetition of a 
response to a word one step removed from a right response. 


RESULTS 


On the basis of Thorndike’s interpretations, one might expect to 
find the effect regardless of the changed positions of the words. 
This expectation is not borne out in Table III, however. There is no 
clear-cut Thorndike effect. 














































TABLE III 
REPETITION OF RESPONSES TO SHIFTING WORDS 

                         Wrong Responses Preceding    Right    Wrong Responses Following 
                                 by Steps:            Resp.            by Steps: 
                           4      3      2      1               1      2      3      4 
Possible repetitions      756    756    756    756     756     756    756    756    756 
Actual repetitions         95    104    109     97     143     105     86     96    107 
Percent repeated         12.6   13.8   14.4   12.8    18.9    13.9   11.4   12.7   14.2 








It may be argued that the results obtained in this experiment 
might have been essentially the same even if the positions of the 
words had not been changed. This is improbable. In the first place, 
lists of words not very different from this list have been used in other 
experiments which do show the Thorndike effect. In the second 
place, this list, with few exceptions, was used in another experiment 
in which the Thorndike effect appeared clearly on the two steps pre- 
ceding a right response and on the three steps following a right 
response.4 

The failure of a clear Thorndike effect to appear in this experi- 
ment suggests that shifts in the positions of the stimulus words were 
at least partially instrumental in precluding its appearance. The 
results are not conclusive, but indicate that we may look with 
question on Thorndike’s assumption that separate ‘connections’ or 
‘bonds’ are established between stimulus items and their responses 
and that these are acted upon as units by a satisfier to produce the 
gradient effect. Apparently other factors are involved in causing 
the effect. 

Suspecting that the position of the wrong responses relative to the 
rewarded response was involved in determining the results, a second 
and different check was made. This time protocols were checked for 
repetition of responses by step position, with no regard for the shifting 
positions of the stimulus words. A response which duplicated the 
response in the same series position on the immediately preceding 
presentation of the list was counted as a repetition. In practice, the 
technique of checking was exactly the same as in Experiment I. 
The results of this check are shown in Table IV. 
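
The two ways of checking the protocols can be contrasted in a minimal sketch. The names `by_word` and `by_position` and the number responses are hypothetical; only the word list comes from the sample given in the Procedure. The first function credits a repetition to the step which the word held on the preceding presentation; the second ignores the words and compares responses at the same series position.

```python
from collections import Counter
from typing import Dict, List, Tuple

Presentation = List[Tuple[str, int]]  # (stimulus word, number called), in list order


def by_word(prev: Presentation, curr: Presentation,
            step_of_word_prev: Dict[str, int]) -> Counter:
    """Repetitions credited to the step each word held on the preceding presentation."""
    prev_response = dict(prev)
    hits: Counter = Counter()
    for word, response in curr:
        if prev_response.get(word) == response:
            hits[step_of_word_prev[word]] += 1
    return hits


def by_position(prev: Presentation, curr: Presentation,
                step_of_position: List[int]) -> Counter:
    """Repetitions credited to the series position, regardless of the word shown."""
    hits: Counter = Counter()
    for pos, ((_, old), (_, new)) in enumerate(zip(prev, curr)):
        if old == new:
            hits[step_of_position[pos]] += 1
    return hits


# Four wrong words preceding a right word; positions are steps 4, 3, 2, 1.
steps = [4, 3, 2, 1]
first = [("printers", 5), ("supports", 2), ("youthful", 8), ("presumes", 3)]
second = [("presumes", 5), ("youthful", 8), ("supports", 7), ("printers", 5)]
step_in_first = {word: step for (word, _), step in zip(first, steps)}

word_hits = by_word(first, second, step_in_first)
position_hits = by_position(first, second, steps)
print(word_hits, position_hits)
```

In this made-up case the word check credits one repetition each to steps 4 and 2, while the position check credits a single repetition to step 4, since the number 5 was called in the first series position on both readings.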

Since wrong words shift between step positions 1 and 4, their 
average proximity to a rewarded response would be equalized, and 
thus we might expect equivalent ‘strengthening’ of the connections. 
The same would be the case for words in positions 2 and 3. Hence, 
there should be no Thorndike effect on a calculation by position alone. 


4 This experiment will be reported in a second article to appear in this JOURNAL. 
TABLE IV 
REPETITION OF RESPONSES BY STEP POSITION 

                         Wrong Responses Preceding    Right    Wrong Responses Following 
                                 by Steps:            Resp.            by Steps: 
                           4      3      2      1               1      2      3      4 
Possible repetitions      756    756    756    756     756     756    756    756    756 
Actual repetitions         91     84     90    101     143     122     95     72     72 
Percent repeated         12.0   11.1   11.9   13.4    18.9    16.1   12.6    9.5    9.5 





Yet a very definite Thorndike effect appears in Table IV on the first 
three steps before and after a right response. Table III, which gives 
repetition of responses in relation to stimulus words, failed to show a 
clear-cut Thorndike effect. But Table IV, showing repetition of 
responses by step position, regardless of stimulus words, does show 
the effect clearly. 
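
The contrast between the two tables can be expressed as a simple check on the first three steps on each side of the right response. This is only a sketch: the helper names are hypothetical, the counts are read from Tables III and IV, and two entries of Table IV (90 at the second step preceding and 72 at the fourth step following) are assumed here from the printed percentages because the counts themselves are damaged in the available copy.

```python
from typing import List, Sequence

POSSIBLE = 756  # possible repetitions per step, as printed in both tables

# Actual repetitions at steps 1, 2, 3, 4 away from the right response.
TABLE_III = {"preceding": [97, 109, 104, 95], "following": [105, 86, 96, 107]}
TABLE_IV = {"preceding": [101, 90, 84, 91], "following": [122, 95, 72, 72]}


def percent_repeated(actual: Sequence[int]) -> List[float]:
    """Percent of possible repetitions actually made, to one decimal."""
    return [round(100 * a / POSSIBLE, 1) for a in actual]


def declines(actual: Sequence[int], steps: int = 3) -> bool:
    """True if repetitions fall strictly with distance over the first `steps` steps."""
    first = list(actual[:steps])
    return all(near > far for near, far in zip(first, first[1:]))


for name, table in (("III", TABLE_III), ("IV", TABLE_IV)):
    for side, counts in table.items():
        print(name, side, percent_repeated(counts), declines(counts))
```

A strict decline over three steps is a deliberately crude criterion for a gradient; it is used here only to make the comparison of the two tables explicit.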

In terms of these results, it appears that the step position of a 
response is more important in relation to the Thorndike effect than 
the position which its stimulus item had in the preceding presentation 
of the list. Thorndike’s explanation of the gradient effect in terms of 
unitary stimulus-response connections which are strengthened in 
relation to their degree of proximity to a pleasurable after-effect fails 
to meet the test of explaining the facts. The results of this experi- 
ment suggest that the more important consideration is the proximity 
of responses to the pleasurable after-effect, and not the proximity of 
stimulus-response connections. 


(Manuscript received September 12, 1945) 
THE EFFECT OF CHANGED POLARITY OF SET ON 
DECISION TIME OF AFFECTIVE JUDGMENTS 


BY WALTER C. SHIPLEY, ELIZABETH D. NORRIS, AND 
MARGARET L. ROBERTS 


Wheaton College, Norton, Mass. 


This study was planned to elucidate a finding previously reported 
by Shipley, Coffin, and Hadsell (1). Using the method of paired 
comparisons, with S instructed to choose the more pleasant of two 
colors, they found choice-time to be related not only to the affective 
distances involved, but also to the affective values of the colors them- 
selves. With the factor of affective distance between the comparison 
stimuli controlled, they found a significant tendency for choices 
between preferred colors to occur more rapidly than between unpre- 
ferred ones. Interpretation of this finding remained equivocal, how- 
ever, since the tendency could conceivably represent either a specific 
affectivity phenomenon, or a broader judgmental one, or both. Was 
it contingent upon pleasantness as such, or simply upon S’s specific 
set to look for pleasantness, i.e., to select the more pleasant color? 
The present experiment sought to illuminate this question through 
reversing the polarity of S’s set by instructing him to choose the less 
pleasing color. Under this circumstance, faster reactions to the 
preferred than to the unpreferred colors would point toward an affec- 
tivity factor, while the converse would point toward a more general 
judgmental one. 


SUBJECTS, APPARATUS, AND PROCEDURE 


The Ss were 40 women college students.1 The apparatus and procedure, except for the 
reversed instructions, were identical in every respect with those of the above cited Shipley, 
Coffin, and Hadsell experiment, and will be reviewed here only in outline. 

Reaction times were recorded as the Ss indicated their preferences among six colors presented 
by the method of paired comparisons. Each S, after receiving some preliminary practice, was 
put through two complete series of 15 presentations each. A camera shutter simultaneously 
exposed the comparison stimuli and started an electric timer; reaction keys stopped the timer 
and indicated the choice. S read the following instructions: 


Sit in the chair and place your chin in the iron rest. Close your left eye. Look at the 
shutter with the right eye. Place the fore-finger of each hand lightly upon the key. When 
the shutter opens look at the colors and indicate the color that you like least by pressing the 
key on the side of that color. Don’t first choose the color that you prefer and then press the 
other key, but pick directly the color you like least. Be careful to keep your left eye tightly 
closed. 





1 These were different individuals from those who served in the Shipley, Coffin, and Hadsell 
experiment. 
The experimentally obtained data were then converted into more refined basic units pre- 
liminary to their major statistical treatment; single reaction times were converted into percentages 
of S’s mean reaction time, and choices were scaled for affectivity-value in standard-deviation- 
type units. This latter was done by finding the mean proportion of judgments favoring each 
choice (regardless of color), converting these mean proportions into standard measurements, 
and adding a constant to eliminate negative values; the mean proportions were obtained by the 
following formula 

\[ \text{Mean proportion} = \frac{C + N}{2nN} \] 

with C representing the total number of preferences for any given choice (i.e., first choice, second 
choice, etc.), n, the number of colors, and N, the number of judges. The resulting choice-scale 
values are given in Fig. 1. 

[FIG. 1. Choice scale. Legible entries (scale value, N): 1st choice, 2.11, 38; 2nd choice, 1.58, 28; 
3rd choice, 1.22, 21; 4th choice, .89, 23; 5th choice, scale value illegible, 25; remaining entries illegible.] 


From the choice-scale values two major measures were obtained: (1) the mean affective 
distance involved in each choice combination, as when S’s first and third preferences were pre- 
sented together, and (2) the mean preference value of each such combination. The former was 
simply the difference between, and the latter the average of, the two given choice-scale values. 
Both measures were then correlated with the mean converted reaction times to reveal the re- 
spective pertinent relationships, i.e., the relationship between reaction time and affective distance, 
and that between reaction time and stimulus-preference value; and as a further index of this 
latter relationship, the tied-rank2 data were analysed and preference values were compared for 
the faster and slower tied-rank reactions. 
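
The two measures can be sketched as follows. The scale values for the first four choices are those legible in Fig. 1; the values 0.55 and 0.00 for the fifth and sixth choices are assumptions, inferred from the affective distances listed in Table I, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

# Choice-scale values; the 5th and 6th entries are inferred, not read from Fig. 1.
SCALE: Dict[int, float] = {1: 2.11, 2: 1.58, 3: 1.22, 4: 0.89, 5: 0.55, 6: 0.00}


def affective_distance(a: int, b: int) -> float:
    """Difference between the two choice-scale values."""
    return abs(SCALE[a] - SCALE[b])


def preference_value(a: int, b: int) -> float:
    """Average of the two choice-scale values."""
    return (SCALE[a] + SCALE[b]) / 2


measures: Dict[Tuple[int, int], Tuple[float, float]] = {
    (a, b): (round(affective_distance(a, b), 2), round(preference_value(a, b), 3))
    for a, b in combinations(sorted(SCALE), 2)
}
for pair, (distance, preference) in measures.items():
    print(pair, distance, preference)
```

Six choices yield 15 untied combinations. Pairs such as first with second and third with fourth have similar distances but quite different preference values, which is what allows the two measures to be examined separately.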





2 Tied-ranks resulted when equal preferences were given two colors by the same S. 
RESULTS 


A. Reaction Time 


The various Ss were found to vary considerably in reaction time. 
Their means, while averaging 1.53 sec., ranged from 0.75 to 4.17 sec., 
with a standard deviation of 0.73 sec., and single individual reactions 
ranged from 0.41 sec. to 9.03 sec. As explained above, each single 
reaction was converted to a percentage of the given S’s mean reaction 
time, and only these converted scores were used in the statistics 
to follow. 
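
The conversion can be illustrated with a minimal sketch; the function name and the reaction times are hypothetical. Expressing each reaction as a percentage of the given S's own mean puts fast and slow Ss on a common scale.

```python
from statistics import mean
from typing import List


def to_percent_of_mean(reaction_times: List[float]) -> List[float]:
    """Express each reaction time as a percentage of this S's mean reaction time."""
    subject_mean = mean(reaction_times)
    return [100 * t / subject_mean for t in reaction_times]


# Two hypothetical Ss whose raw times differ greatly (in sec.).
fast_subject = [0.6, 0.75, 0.9]
slow_subject = [3.2, 4.0, 4.8]

fast_converted = [round(x, 6) for x in to_percent_of_mean(fast_subject)]
slow_converted = [round(x, 6) for x in to_percent_of_mean(slow_subject)]
print(fast_converted)
print(slow_converted)
```

Both hypothetical Ss yield the same converted scores, although their raw times differ by a factor of more than five.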


B. Reaction Time in relation to Affective Distance 


The general relationship between reaction time and affective 
distance is shown graphically in Fig. 2, in which mean reaction times 
are plotted against affective distances. A regression line, computed 
by the method of least squares, shows the general slope, and the 
obtained product moment coefficient of correlation for the plotted 
points is r = −.87 ± .089. The coefficient of determination of 
d = .76, arrived at by squaring this r, indicates that about 76 percent 
of the variance in mean reaction time is associated with variance in 
affective distance. The data from which Fig. 2 is plotted are given 
in Table I, which shows the mean reaction time, together with N and 
σ, for each choice combination. 

[FIG. 2. Reaction time in relation to affective distance] 
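
The step from r to d is a single squaring, shown here as a brief sketch with a hypothetical function name.

```python
def coefficient_of_determination(r: float) -> float:
    """Proportion of variance associated with the other variable: the square of r."""
    return r * r


d = coefficient_of_determination(-0.87)
print(round(d, 2))
```

The sign of r is lost in squaring, so d carries the strength of the relationship but not its direction.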


TABLE I 

SHOWING MEAN AFFECTIVE DISTANCE, MEAN REACTION TIME, REACTION TIME σ, 
AND N FOR EACH CHOICE COMBINATION 

Choice          Mean Affective   Mean Reaction   Reaction 
Combination     Distance         Time*           Time σ**       N 
[illegible]     2.11              86             16.2           33 
[illegible]     1.58              95             11.0           24 
[illegible]     1.56              94             19.5           25 
[illegible]     1.22             103             21.7           23 
[illegible]     1.22             100             17.0           17 
[illegible]     1.03             103             28.1           18 
[illegible]     0.89              98             23.7           20 
[illegible]     0.89              92             15.9           18 
[illegible]     0.69             103             26.6           15 
[illegible]     0.67              99             19.9           11 
[illegible]     0.55              99             17.5           25 
[illegible]     0.53             113             22.4           27 
[illegible]     0.36             104             23.7           20 
[illegible]     0.34             110             22.2           18 
[illegible]     0.33             116             25.9           15 
[illegible]     0.00             111             [illegible]    29 

















* Computed from converted scores; each single reaction was converted to a percentage of 
the given S’s average reaction time. 

** The two scores per S were averaged before computing the σs. 

*** Based on averages of from two to 12 scores per S. 


C. Reaction Time in relation to Preference Value of Stimulus 


The relationship between reaction time and stimulus preference 
value, in which slow reactions were associated with high stimulus 
preference value, showed up consistently in several ways: (1) in the 
correlation between preference value and reaction time, (2) in the 
correlation between preference value and algebraic reaction-time 
deviation from the regression line, and (3) in the trend of the tied- 
rank data. 

Mean preference value and reaction time. The correlation³ between 
these two variables for the plotted points was found to be r = .29 
± .17, with a coefficient of determination (r²) of d = .086. Since a 
complete lack of relationship was found to obtain between preference 
value and affective distance, namely, r = .006 ± .18, it appears that 
preference value was functioning as an independent determinant and 
accounting for about nine percent of the variance in mean reaction 
time. 

³ The N for this and the following r’s was 15, the plotted point representing the tied-rank 
mean having been omitted. 

Mean preference value and mean algebraic RT deviation from the 
regression line. The correlation between these two variables, for the 
plotted points, was r = .45 ± .15 for the two stimuli (choice scale 
values) combined; the corresponding correlations for the two stimuli 
taken separately were r = .40 ± .17 for the more preferred, and 
r = .39 ± .17 for the less preferred. 

Trend of the tied-rank data. Analysis of these data, i.e., the data 
from the 33 instances in which an equal number of preferences was 
given to two colors by the same S, showed the mean preference value 
for the 16 fastest combined⁴ tied-rank reactions to be 0.78 as compared 
with 1.16 for the 17 slowest, the CR of the difference being 3.16.⁵ 
This finding was almost completely independent of the others, since 
the tied-rank data were not included in the above correlations. 
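
A critical ratio of this kind is commonly formed by dividing the difference between two means by the standard error of that difference. The sketch below assumes independent groups; the source does not state how its CR was computed and does not report the standard deviations, so the value of 0.35 used for both groups is purely hypothetical and the result is not expected to reproduce the reported 3.16.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Means and Ns are those reported for the tied-rank data;
# the standard deviations (0.35) are hypothetical placeholders.
cr_example = critical_ratio(1.16, 0.35, 17, 0.78, 0.35, 16)
print(round(cr_example, 2))
```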


RESULTS IN RELATION TO THE PREVIOUS EXPERIMENT 


The findings of the present study were strikingly similar to those 
of the previous Shipley, Coffin, and Hadsell study in every respect 
except one—the critical relationship between reaction time and 
stimulus preference value. 

Reaction times were of the same general order, though slightly 
slower in the present study; the slight difference between the means 
(1.53 sec. as compared with 1.41 sec., with a CR of 0.88) was not 
statistically significant. Likewise the relationship between reaction 
time and affective distance was remarkably similar in the two studies; 
in both, faster reactions were associated in fairly linear manner with 
greater affective distance, the correlation between these two variables 
being .86 and .87, respectively, for the plotted points. This correla- 
tion was found to increase to .96 when the data from the two studies 
were combined. 

The relationship between reaction time and stimulus preference 
value, on the other hand, was completely reversed. The correlations 
in the previous study were all negative and fairly high, while those in 
the present one were all positive. These values are given in Table 
II. The tied-rank data, which were obtained completely independently 
of the correlation data, also showed the reversal. In the 
previous study the mean preference score for the fastest tied-rank 
reactions was 49 percent higher than for the slowest, with a CR of 
1.95; in the present study it was 33 percent lower, with a CR of 3.16. 

⁴ The two reactions to a given choice combination were averaged to give S’s combined 
reaction, and S sometimes participated in more than one such combination. The preference value 
assigned a tied-rank was simply the midpoint between the scale values for the two integral ranks 
(choices) adjacent to it on the scale. 

⁵ With only five tied-rank categories the use of r seemed inadvisable here. 

TABLE II 

SHOWING COMPARABLE CORRELATIONS* BEARING ON THE RELATIONSHIP BETWEEN 
REACTION TIME AND STIMULUS PREFERENCE VALUE FOR (A) THE SHIPLEY, 
COFFIN & HADSELL STUDY, WITH S SET TO RESPOND TO THE MORE 
PLEASANT STIMULUS, AND (B) THE PRESENT STUDY, WITH S SET 
TO RESPOND TO THE LESS PLEASANT STIMULUS 

                                                     (A)                  (B) 
Variables Correlated                                 Shipley, Coffin &    Present Finding 
                                                     Hadsell Finding 
(1) Preference value of two stimuli combined & 
(2) Algebraic RT deviation from regression line      r = −.70 ± .09       r = +.45 ± .15 
(1) Preference value of more preferred stimulus & 
(2) Algebraic RT deviation from regression line      r = −.58 ± .12       r = +.40 ± .17 
(1) Preference value of less preferred stimulus & 
(2) Algebraic RT deviation from regression line      r = −.59 ± .12       r = +.39 ± .17 
(1) Preference value of two stimuli combined & 
(2) Reaction time                                    …                    r = +.29 ± .17 

* All correlations based on 15 plotted points. 


DISCUSSION 


The findings of the two studies taken together, with S discriminat- 
ing more rapidly between preferred colors when set to select the more 
pleasant, and between unpreferred when set to select the less pleasant, 
point toward the operation of a polarity of set factor manifesting itself 
in a slight but consistent tendency to react more rapidly as the stimuli 
possess that characteristic for which S has been instructed to look. 
Although this factor was demonstrated in judgments of color prefer- 
ence obtained by the method of paired comparisons, there is good 
reason to believe that it transcends both the materials and the 
method, and is an intrinsic determiner of speed in many instances of 
comparative judgment. 

It is also possible that a specific affectivity factor may have been 
operating over and above the judgmental one, since the correlation 
between preference value and reaction time was somewhat greater 
(sign disregarded) under the positive set. The data are not sufficiently 
striking in this respect, however, to do more than barely 
suggest such a possibility. 


SUMMARY 


Judgment-times were determined chronoscopically for 40 Ss as 
they indicated color preferences for six hues by the method of paired 
comparisons. The materials, apparatus, and procedure were identical 
with those previously employed by Shipley, Coffin, and Hadsell, 
except that S was instructed to indicate the less instead of the more 
pleasing color. Taken together, the results of the two experiments 
point to the operation of a polarity of set factor in speed of comparative 
judgment. With the factor of affective distance between the 
comparison stimuli eliminated or held constant, there was a consistent 
tendency to choose more rapidly between the pleasanter colors 
under the positive set (to select the more pleasant one) and between 
the less pleasant ones under the negative set (to select the less 
pleasant one). 


(Manuscript received September 17, 1945) 
THE EFFECTS OF CARBON MONOXIDE ON THREE 
TYPES OF PERFORMANCE, AT SIMULATED 
ALTITUDES OF 10,000 AND 15,000 FEET* 


BY ERWIN P. VOLLMER, BARRY G. KING, JAMES E. BIRREN, AND 
M. BRUCE FISHER 


Naval Medical Research Institute, Bethesda, Md. 


INTRODUCTION 


Interest in the effects of exposure to carbon monoxide (CO) has 
been intensified by the rapid expansion of military activities in enclosed 
spaces where exhaust gases may enter. Recent reports refer 
to the presence of the gas in hangars, aircraft carriers, and in planes. 
In aircraft it is feared that CO would aggravate a previously existent 
anoxia by further reduction of the amount of oxygen available for 
the tissues. According to some investigators (5) the presence of 20 
percent blood carboxyhemoglobin (COHb) impairs brightness con- 
trast discrimination at 12,000 feet to a degree equivalent to that 
expected at an altitude of 18,750 feet from anoxia alone. 

The kind and extent of impairment to sensory and motor functions 
resulting from CO poisoning cannot be predicted from present physio- 
logical knowledge. The effects of anoxia alone (simulated altitude) 
have been described by Birren and co-workers in terms of the ataxio- 
graph, perimetry, and critical flicker frequency measurements (1). 
Body sway increases, the size of the red field of vision decreases, and 
the critical flicker frequency threshold is significantly lowered at 
altitude. The purpose of the present study was to determine whether 
moderately increased carboxyhemoglobin in the blood increases the 
deterioration of performance of these tests which has been demon- 
strated in anoxia. The experiments were designed to indicate the 
extent of decrement and variability of response under given circum- 
stances within a group of subjects rather than to establish ‘altitude 
equivalents.’ 


PROCEDURE 


The procedure followed was very similar to that of the parallel research on the effects of 
anoxia (1), except for the additional preparations necessary because of the carbon monoxide 
treatment. In brief, the Ss were given three performance tests in a low pressure chamber: (a) 
at sea level in the chamber before ascent, (b) five times at equal intervals during an hour at 
altitude, and (c) at sea level immediately after descent. Each S had an anoxia run and one or 
more CO runs about a week apart, the sequence of the runs being varied from S to S. 

* The material in this article should be construed only as the personal opinion of the writers 
and not as representing the opinion of the U. S. Navy Department. 

In all runs the following preliminaries to the chamber procedure were carried out. After 
the initial venous blood sample for COHb was taken, the S breathed a mixture of air and CO, or 
air alone. Another blood sample was then taken, and the S began his sea level tests. The 
samples were collected in a syringe from the antecubital vein, mixed at once with heparin under 
oil in a test tube, and analyzed by the method of Horvath and Roughton (4). 

The gas mixtures were breathed from a demand type mask connected through a gas meter 
to a spirometer. The mixtures were composed of pure CO diluted with air and mixed by repeated 
transfer between a Douglas bag and the spirometer. Samples were analyzed by a modification 
of the Nicloux iodine pentoxide method (6). The preparations used consisted of 28 and 56 parts 
of CO per 10,000, for the 10,000 and 15,500 feet runs respectively. Each S breathed 58 to 63 
liters of mixture after the tubing and gas meter had been flushed out; this amount was calculated 
to result in the formation of seven or eight percent COHb with the less concentrated mixture and 
15 to 18 percent with the more concentrated. Several formulas have been described for the 
purpose of producing definite increments in COHb; that of Pace and co-workers (7) was used in 
the preliminary calculation. A well-distributed range of COHb values within the desired limits 
was obtained when Ss breathed the stated amounts within a time (about five min.) too short for 
precise prediction of COHb by the formula. Since pre-experiment COHb values up to nine 
percent were found and since analysis could not be made before the runs, the object was to obtain 
COHb increases within a specified range rather than specific values. 
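
The quantity of pure CO contained in the mixture breathed follows directly from the stated concentrations and volumes. The sketch below is plain arithmetic on those figures; the function name is an illustrative assumption, and no attempt is made to reproduce the formula of Pace and co-workers (7), which is not given in the text.

```python
def co_volume_liters(parts_per_10000, mixture_liters):
    """Volume of pure CO contained in a given volume of CO-air mixture."""
    return mixture_liters * parts_per_10000 / 10_000


# Concentrations and breathed volumes as stated in the procedure.
for parts in (28, 56):
    low = co_volume_liters(parts, 58)
    high = co_volume_liters(parts, 63)
    print(f"{parts} parts per 10,000: {low:.4f} to {high:.4f} liters of CO")
```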

Of 17 Ss who completed both experimental and control runs at 15,500 feet, 10 were unaware 
of the order in which their runs were made. A ‘dummy’ mixture (air) replaced the CO mixtures 
in these control runs. In this way the factor of suggestion as a determinant of functional impair- 
ment was minimized. 

The Ss were officers and enlisted men in naval service, 18-39 years of age. Only Ss who 
completed all tests in both anoxia and CO runs were included in the data. Three Ss completed 
two CO runs each, and the results of these two runs were averaged to count as one in the group 
statistics. When correlated with COHb percentages, however, each CO run was treated separately 
and rated according to the COHb values of that run. Data from Ss who collapsed¹ were 
usually incomplete and were omitted, except those from one S who did not collapse until the end 
of a run. 


TEST METHODS 


The three test measurements were body sway, size of ‘red visual 
field,’ and critical flicker frequency. Anterior-posterior body sway, 
at the level of the top of the head, was recorded by means of an 
ataxiograph (3). Trials were four min. in length, scores being the 
sum of body sway for two min. with the S’s eyes open and two min. 
with eyes closed. The average of two such trials was used as the 
initial sea level value. The reliability of the four min. test is 0.87 (3). 

Measurements of the ‘red visual field’ were made by means of a 
Ferree-Rand perimeter (Bausch and Lomb) and with the use of the 
standard 1° test object provided with the instrument. The value 
recorded for each run interval was an average of readings made on 
each of the eight principal meridians; the initial sea level value, how- 
ever, represented a double series of such readings. The eight- 
measure ‘mean radius’ has a split half reliability of 0.93. 


¹ Collapse in this paper implies that the S did not complete the measurements at altitude 
because he displayed symptoms of imminent syncope, was given oxygen and removed from the 
chamber. 
The critical flicker frequency measurements were made by means 
of an apparatus (2) in which the S views binocularly a 1° test object 
18 in. from his eyes. The S reports verbally when the lamp appears 
to change from a flicker to a constant glow, or vice versa. The test 
object, a small section of a one-watt neon tube in a vacuum tube 
circuit, flashes at a frequency controlled by a variable step-wire 
resistor. The readings were taken by the method of limits, the 
threshold being approached alternately from above and below. The 
value for each run interval and for the final control was an average 
of four such readings, while the initial sea level control represented 
ten readings. In each trial the S was allowed one min. to become 
adapted to the brightness level of the test field. 

Before the experiments were carried out, all Ss participated in 
trials with the ataxiograph, perimeter and critical flicker frequency 
apparatus, in order that they might become familiar with the equip- 
ment, procedure and judgments involved. 


RESULTS 


A sea level mean for each S was derived from the initial and final 
values in each run, and another mean was derived from the five 
values obtained at altitude. The difference between these two 
means represented the mean decrement in performance during an 
hour at altitude in each run. The decrement established in this way 
for the anoxia run, subtracted from the decrement in the CO run, 
gives a measure of the individual’s deterioration in performance 
attributable to increased COHb. No uniform tendency was apparent 
in individual data. Thus, 17 Ss completing a total of 20 CO runs at 
15,500 feet showed the greater deterioration with CO 11 times in 
perimetry, nine times in critical flicker frequency, and nine times in 
body sway. Deterioration in one test was not regularly paralleled 
by deterioration in the other tests. 
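
The scoring just described can be sketched as follows. The function names and the single-subject readings are hypothetical; the sign convention (altitude mean minus sea level mean) is assumed from the entries of Table I, and the final check uses the red field group means given there.

```python
from statistics import mean


def mean_decrement(initial_sea, final_sea, altitude_values):
    """Altitude mean minus sea level mean for one run (negative = lower score)."""
    sea_level_mean = mean([initial_sea, final_sea])
    altitude_mean = mean(altitude_values)
    return altitude_mean - sea_level_mean


def co_effect(decrement_anoxia, decrement_co):
    """Change attributable to COHb: CO-run decrement minus anoxia-run decrement."""
    return decrement_co - decrement_anoxia


# Hypothetical red field readings (degrees) for one S in each kind of run.
d_anoxia = mean_decrement(27.0, 26.5, [25.5, 25.0, 25.8, 25.6, 25.6])
d_co = mean_decrement(27.2, 26.8, [25.4, 25.1, 25.3, 25.6, 25.1])
print(round(co_effect(d_anoxia, d_co), 2))

# Check against the red field group means reported in Table I.
group_d1 = round(25.68 - 26.75, 2)  # anoxia
group_d2 = round(25.69 - 27.06, 2)  # anoxia plus CO
print(round(co_effect(group_d1, group_d2), 2))
```

A negative result for red field or critical flicker frequency, and a positive one for body sway, indicates greater deterioration in the CO run than in the anoxia run.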

The differences in the group mean values of the CO trials compared 
with those for anoxia were generally in the expected direction, namely, 
decreased field of red vision, decreased critical flicker frequency, and 
increased body sway (Table I). They were small, and not significant. 
The P-values for the differences between decrements were: .40-.50 
in perimetry, .10 to .20 in body sway, and .70 in critical flicker fre- 
quency. According to these values, the changes in performance of 
Ss with 12-22 percent blood COHb at 15,500 feet were not dis- 
tinguishable from the changes in the same Ss with simple anoxia at 
the same altitude. 

The curve for the mean scores in body sway during the CO runs 
deviates upward from that of the anoxia runs, especially at the 
beginning of the first hour at altitude (Fig. 1), but the difference at 
this point is not significant (P > .10). The difference at the point 
of greatest divergence between the curves is also not significant in 
critical flicker frequency (P > .60). The small and apparently 
consistent differences that are noted between the curves for the CO 
and anoxia runs are accentuated by the difference in the sea level 
values, presumably sampling error. The difference between the 
initial sea level means for the critical flicker frequency curve is the 
same size as the difference occurring during the hour at altitude; 
hence the experimental and control curves may be regarded as coincident. 
With regard to the perimetry and body sway curves, 
however, the differences that occur during the hour are in general 
larger than the differences that appear between the initial sea level 
values. Therefore although the curves did not differ significantly 
at any given point, the consistency of the direction reflects a trend 
towards a real difference. By subtracting the difference between the 
initial sea level values from the differences occurring throughout the 
hour at altitude, the consistency of the trend can be checked. When 
this is done it is found that the body sway curves for the two conditions 
do not differ significantly (P = .05-.10), but the perimetry curves 
show a significant difference (P < .01). It would appear therefore 
that there was a trend in some of the data indicating deterioration of 
performance, but the deficit appears small in the light of the large 
variations that arise from other sources, such as individual differences, 
day to day changes in the Ss, and variations in the individual’s 
response to anoxia during the hour’s exposure. 

TABLE I 

GROUP MEANS AND DECREMENTS IN THREE TESTS GIVEN AT 15,500 FEET IN ANOXIA 
AND CARBON MONOXIDE RUNS (N = 17) 

                               Anoxia                         Anoxia plus CO 
Test                  Sea Level  15,500   D1        Sea Level  15,500   D2        D2 − D1 
                      Mean       Mean     (diff.)   Mean       Mean     (diff.)   (decrement) 
Red field (degrees)   26.75      25.68    −1.07     27.06      25.69    −1.37     −0.30 
C.F.F. (c.p.s.)       39.77      38.60    −1.17     39.87      38.61    −1.26     −0.09 
Body sway 
  (cm./4 min.)        70.07      82.33    +12.26    69.45      90.72    +21.27    +9.01* 

* Plus sign indicates increased sway, i.e., deterioration in this test performance. 

In order to correlate the quantities of CO absorbed with the test 
scores, each S was ranked according to the total percentage of COHb 
in his blood at the beginning of each run and according to the increase 
in percentage of COHb subsequent to exposure to the gas mixture. 
Each rating was compared with the S’s position in the group with 
respect to the difference between his performance decrements with 
and without CO at altitude. The rank-order correlations (ρ) for all 
tests were found to be small, indicating that, under the conditions of 
the experiments, decrements in performance were not correlated with 
either the total percentage or the increase in percentage of COHb 
(Table II). 

Fig. 1. Group means on three tests at sea level and at 15,500 feet, with and without 
carboxyhemoglobin (12-22 percent). Vertical lines indicate standard deviation of sea level means. 
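
A rank-order correlation of the kind used here can be computed from the squared differences between paired ranks. The sketch below uses the usual formula for untied ranks, ρ = 1 − 6Σd²/[N(N² − 1)]; the function names and the data are hypothetical, and the source does not say how ties, if any, were handled.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_order_correlation(xs, ys):
    """Rank-difference coefficient rho for two untied series of equal length."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical values: total COHb (percent) and CO-minus-anoxia decrement.
cohb = [12.5, 14.0, 15.6, 18.2, 21.9]
decrement_difference = [0.4, -0.2, 1.1, 0.3, -0.5]
print(round(rank_order_correlation(cohb, decrement_difference), 2))
```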


TABLE II 


RANK ORDER CORRELATION BETWEEN TOTAL CARBOXYHEMOGLOBIN, OR INCREASE IN CARBOXYHEMOGLOBIN, 
AND DIFFERENCE BETWEEN THE TEST PERFORMANCE DECREMENTS, 
WITH CO AND WITHOUT CO, ON GOING FROM SEA LEVEL TO 15,500 FEET 
SIMULATED ALTITUDE 

Rank Order Correlation Coefficients (ρ) (N = 20) 

Test                    Total COHb     Δ COHb 
IN Aha dydt ae at Sal ben s +.31 
aie he anne aa we casas oo, 0k — .60 
BE GG bak écvncnedd dc avaeec ee +.21 


Five Ss completed runs at 10,000 feet, with five to 10 percent 
COHb in their blood (Table III). The group was too small for 
statistical analysis, but the individual scores showed the same sort of 
variation as in the group who completed runs at 15,500 feet. 

TABLE III 

GROUP MEANS AND DECREMENTS IN THREE TESTS GIVEN AT 10,000 FEET IN ANOXIA 
AND CARBON MONOXIDE RUNS (N = 5) 

                               Anoxia                         Anoxia plus CO 
Test                  Sea Level  10,000   D1        Sea Level  10,000   D2        D2 − D1 
                      Mean       Mean     (diff.)   Mean       Mean     (diff.)   (decrement) 
Red field (degrees)   31.75      29.05    −2.70     27.44      25.38    −2.06     +0.64 
C.F.F. (c.p.s.)       39.48      38.41    −1.07     40.48      39.89    −0.59     +0.48 
Body sway 
  (cm./4 min.)        60.77      57.52    −3.25     47.24      55.50    +8.26     +11.51 

Two Ss who collapsed from anoxia at 15,500 feet were not given 
CO runs. Six Ss participating in paired runs collapsed, four in CO 
runs at 15,500 feet, one in a CO run at 10,000 feet, and one in a control 
run at 15,500 feet. The latter succeeded in completing his CO run. 
All but one of these Ss showed signs of collapse early and made rapid 
recovery with oxygen. The severest collapse occurred in an S with 
15.6 percent total COHb but after he had completed all tests at 
15,500 feet. As with anoxia alone, body sway increased markedly 
just prior to collapse, but large increases were not always followed by 
collapse. Suggestion does not appear to have been an important 
factor in collapse, since the changes in performance of Ss with knowledge 
of the gas mixture breathed were no greater than in those from 
whom that knowledge was withheld. 
DISCUSSION 


That there were found no significant differences in performance 
attributable to CO does not necessarily mean that individuals were 
unaffected by the gas. It indicates rather that in this relatively un- 
selected group there were large variations which may have masked 
any uniform tendency to impairment beyond that attributable to 
anoxia alone. 

Compensatory reactions are called into play by anoxia, and the 
effectiveness of these reactions, differing from individual to individual, 
probably accounts for some of the variations. Foremost among 
these are the cardiovascular and respiratory changes. Cardiac out- 
put increases, partly as a result of increased pulse rate. More 
effective ventilation and a rise in minute respiratory volume may 
increase arterial saturation through displacement of the oxygen 
saturation curve to the left. These and other compensations may 
serve to supply sufficient oxygen to tissues to prevent definite impair- 
ment of individual function in Ss at rest. At higher levels of activity 
the amount of available oxygen may not be sufficient, even though 
the circulation and the arterial-venous oxygen difference may increase. 

When anoxic anoxia is complicated by formation of COHb the 
ratio of oxyhemoglobin to total hemoglobin, and also the oxygen 
content, decreases. On the other hand, the saturation of the hemo- 
globin which remains available for oxygen carriage may be increased. 
Thus, even if the oxygen content diminishes and oxyhemoglobin 
becomes more stable, diffusion to the tissues may be favored by the 
resulting oxygen tension. Tissue anoxia develops only when the 
rate of oxygen supply falls below metabolic requirements. If the 
latter are low, as when an S is at rest or engaged in light activity, the 
oxygen supply may be sufficient to prevent tissue anoxia. The 
situation is different if the metabolic needs of the tissues are high. 
Pitts and Pace (8) have shown that during moderate exercise the pulse 
rate increases progressively with increasing altitude, and that follow- 
ing CO administration the pulse rate increments are greater than those 
at corresponding altitudes during uncomplicated anoxia. 


SUMMARY AND CONCLUSIONS 


1. Measurements of the critical flicker frequency threshold, body 
sway, and the red visual field were made on Ss before, during, and 
after low pressure chamber runs. Twenty Ss with 12 to 22 percent 
blood COHb took part in runs at 15,500 feet, and six Ss with five to 10 
percent COHb were tested at 10,000 feet. Control runs were made 
at the same altitudes with the same Ss. 

2. There was a significant impairment of performance at altitude, 
both under conditions of anoxia alone and anoxia after exposure to 
CO, as compared with performance at sea level. 

3. There was no statistically significant difference between the 
mean scores of the tests during anoxia and during anoxia following 
administration of CO. Furthermore, the time-performance curves 
were not found to differ significantly at any point during the hour at 
altitude. 

4. Individual responses were variable and without correlation 
with the percentage of increment or of total COHb in the blood. 
Three of the Ss who started the experiments showed symptoms of 
impending collapse at 15,500 feet without CO. One of these success- 
fully completed his corresponding CO run; the other two did not 
participate in CO runs. Five Ss showed symptoms of impending 
collapse at altitude (one at 10,000, four at 15,500 feet) after they had 
breathed mixtures containing CO. 

5. The results may be interpreted to mean that during anoxia 
(breathing air at 15,500 feet) the additional burden imposed by 
further reduction in arterial oxygen by formation of small amounts 
of COHb is masked by compensatory mechanisms. Alternatively, 
they may be interpreted to mean that increases of nine to 19 percent 
COHb do not impose an important additional stress during light 
activity. 


(Manuscript received November 26, 1945) 
THE LEARNING AND RETENTION OF CONCEPTS. IV. 
THE INFLUENCE OF THE COMPLEXITY 
OF THE STIMULI 


BY HOMER B. REED 
Ft. Hays Kansas State College 


The problem of this investigation was to find out the extent to 
which the formation and retention of concepts was a function of the 
complexity of the stimuli from which they were derived. More 
specifically, what is the relation of the complexity of the stimuli to 
(1) the effort required to learn and relearn concepts, (2) the difference 
in effort required to learn and relearn consistent and inconsistent 
concepts, (3) the rate of forgetting concepts, (4) the percent of con- 
sistent concepts acquired, and (5) the kinds of erroneous concepts 
formed and their distribution? 


The method was the same as heretofore described (1). Forty-two cards having a nonsense 
syllable on the back and English words on the front, one of which belonged to a category repre- 
sented by the syllable, were presented at the rate of one card every seven sec. until the S could 
respond correctly to each card without assistance. The materials were the same except that the 
stimuli were made more complex by (1) increasing the number of words and (2) introducing con- 
fusing categories based on a varying number of instances. Four degrees of complexity were 
devised as follows: 

First degree, or Group 4 ab: four words on a card one of which belonged to a category chosen 
to belong to the nonsense syllable. 

Second degree, or Group 10: same as first degree, except that a second line of four unrelated 
words was added so that the S had to select the key word from one of eight words. 

Third degree, or Group 8: same as first degree, except that a second line of four words was 
added, one of which belonged to a confusing category. Three confusing categories were intro- 
duced for each syllable, which therefore stood for four categories or concepts: one with three 
instances, one with four instances, one with five instances, and one with seven instances. The 
one with seven instances was called the correct or consistent concept since it was the only one in 
the series which fitted all the cards having a given name and for which there were no contrary 
or negative instances. Since there were six syllables and 42 cards, the maximum number of 
instances for a given concept was seven. In the third degree, the S therefore not only had to 
select the key word from one of eight as in the second degree, but also to discriminate the concept 
which fitted all the cards having the same name from those which fitted only part of them. 

The foregoing arrangement required some changes in the first series of English words used 
with Groups 4 and 5 as reported in a previous paper. The concepts and the number of instances, 
frequencies, or words for each syllable were as presented in Table VII. 

Fourth degree, or Group 9: same as third degree, except that the second line used to make 
the second degree of complexity was added to each card in this series so that the S had to select 
the key word from one of 12 words. 

In each of the degrees beyond the first, the position of the line having the key word for the 
correct concept was changed irregularly. In the second and third degrees, it was irregularly the 
upper or lower line. In the fourth degree, it was irregularly the first, second, or third line on 
each card. This was done to counteract the position habits that might be made in favor of one 
line. 
TABLE VII 

SYLLABLES, CONCEPTS, AND FREQUENCIES OF WORDS BELONGING TO EACH CONCEPT 

Syllable    Concept            Frequency of Words Belonging to Concept 
Kun         animal             … 
            relatives          … 
            money              … 
            dishes             … 
Vor         love               … 
            furniture          … 
            weather            … 
            meals              … 
Yem         flowers            … 
            metals             … 
            buildings          … 
            grains             … 
Bep         vegetables         … 
            beverages          … 
            time               … 
            locomotion         … 
Dax         color              … 
            tools              … 
            clothes            … 
            school subjects    … 
Jik         tree               … 
            body parts         … 
            number             … 
            vehicle            … 


The number of Ss used was 19 in Group 4 ab, 20 in Group 10, 25 in Group 8, and 26 in 
Group 9. Most of them were junior-college students taking beginning courses in psychology. 
By means of the Henmon-Nelson Test of Mental Ability for College Students, each group was 
divided into two equal sub-groups a and b, which relearned after one week and three weeks 
respectively. To each S was read Directions No. 4, which required each S not only to learn the 
name of each card but also to find its meaning. 


The problems presented to the learner in these situations are 
analogous to problems in analysis occurring in life situations in which 
the first degree requires the finding of a constant appearing among 
three variables; the second degree, the finding of a constant appearing 
among seven variables; the third degree, the finding of a constant 
appearing among seven variables, four of which are partial constants; 
and the fourth degree, the finding of a constant appearing among 11 
variables, four of which are partial constants. 


RESULTS 


1. Relation of complexity of stimuli to amount of effort required to 
learn and relearn the concepts. 

We would expect that the more complex a problem is the greater 
the amount of effort required to solve it. To measure the exact
relationship, we used as a measure of the amount of work the average
number of promptings per concept required to enable the S to respond
correctly without assistance to all the cards in a series. To prevent
serious differences in the degree of learning, the same criterion was
required in the series of each degree of complexity.
Table VIII and Fig. 5 set forth the results.


TABLE VIII


MEAN (M) NUMBER OF PROMPTINGS (P) REQUIRED TO LEARN AND RELEARN
IN RELATION TO COMPLEXITY OF STIMULI


Degree of                No. of      M.P. to             M.P. to
Complexity     Group     Concepts    Learn      S.D.     Relearn    S.D.
1. First       4 ab      114         30.70      13.40    1.35       1.76
2. Second      10        120         39.95      20.75    2.57       2.75
3. Third       8         150         49.20      23.70    3.43       3.35
4. Fourth      9         156         54.50      20.85    5.89       5.25

Differences
M2-M1                                9.25       2.25     1.22       0.30
M3-M1                                18.50      2.70     2.08       0.37
M4-M1                                23.80      2.52     4.54       0.53
M3-M2                                9.25       2.29     0.86       0.30
M4-M2                                14.55      3.32     2.07       0.50
M4-M3                                5.30       2.55     2.46       0.31


Fig. 5. Relation between complexity of stimuli and amount of effort


Fig. 6. Relation between complexity of stimuli and retention of concepts


In Fig. 5, it is shown that the amount of effort to learn increases in almost a straight-line
manner as the complexity increases. From Table VIII, it may be 
seen that the difference between the first degree and each higher 
degree is much larger than necessary to be statistically significant and 
also that the greater the difference in complexity, the greater is the 
ratio of the difference to its standard deviation or error. The same
is true of the difference between the second degree and each higher
degree. Even the difference between the third and fourth degrees
is large enough to be statistically significant, although less so than the
others. 
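
The ratio referred to above, a difference between two means divided by the standard deviation (error) of that difference, can be recomputed from the "Differences" rows of Table VIII. The following is a minimal sketch and not the author's own computation; the dictionary layout and the function name `critical_ratio` are illustrative assumptions, and the figures are copied as printed for the learning columns only.

```python
# Differences in mean promptings to learn, paired with the S.D. of each
# difference, copied as printed in Table VIII (learning columns only).
learning_differences = {
    ("second", "first"): (9.25, 2.25),
    ("third", "first"): (18.50, 2.70),
    ("fourth", "first"): (23.80, 2.52),
    ("third", "second"): (9.25, 2.29),
    ("fourth", "second"): (14.55, 3.32),
    ("fourth", "third"): (5.30, 2.55),
}


def critical_ratio(difference, sd_of_difference):
    """Ratio of a difference between means to its standard deviation."""
    return difference / sd_of_difference


ratios = {
    pair: critical_ratio(diff, sd)
    for pair, (diff, sd) in learning_differences.items()
}

for (higher, lower), ratio in ratios.items():
    print(f"{higher} degree minus {lower} degree: ratio = {ratio:.2f}")
```

Read against the text, the ratios involving the first degree grow as the compared degrees move farther apart, and the ratio for the third and fourth degrees is the smallest of the six.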

The same statements can be made with respect to the relation 
between effort to relearn and the complexity of the stimuli. Fig. 5 
shows that the slope of the relearning curve is less than for the learning 
curve, which means that the rate of increase in work from one degree 
of difficulty to the next is less in relearning than in learning. Although
the rate of increase in work is small, the differences in
amount of work between each degree of difficulty and any other
degree of difficulty used are statistically significant.

2. Relation of complexity of stimuli to difference in effort required 
to learn and relearn concepts. 

In our previous investigations, we found that inconsistent concepts 
required considerably more effort than consistent ones. This was 
true regardless of the set of the S, or length of the series. Does this 
remain true when the complexity of the stimuli is increased, and does 
the amount of difference in effort to learn and relearn consistent and 
inconsistent concepts vary with the degree of complexity? Table IX 
sets forth our results on this problem. 

Table IX shows that except for the third degree of complexity, 
the usual difference in favor of consistent concepts holds. There is
no consistent variation in the amount of the difference in relation to 
the degree of complexity, but if we compare the difference for the 
fourth degree with that of the first and second degrees there appears 
to be a trend in favor of a decrease in the amount of the difference 
with an increase in the degree of complexity. The difference for the 
third degree is a striking exception to the rule. Here there is a
large difference in favor of inconsistent concepts. This can be ex- 
plained in terms of the processes used by the learners to produce 
these results and in terms of the distribution of the types of learners. 
It sometimes happens that a learner will use illogical methods for 
each of the syllables, that is, he will try to learn the syllables by letter 
and sound associations between the syllables and the stimulus words, 
or he will connect the syllables with the first word on each card, or 
he will try to memorize the order of the syllables. After a dozen 
trials he may have learned to respond to most of the cards by such 
methods, but then he discovers that words of a certain kind go with
some of the syllables, and during the last one or two trials, he dis- 
covers the consistent concepts of a few of the syllables. Under such 
conditions, the number of promptings for a consistent concept may 
be as great as for an inconsistent one. For example, Subject MB 
had only word and letter associations with each of the syllables until 
Trial 12, when she discovered the meaning of kun. In Trial 18, she
discovered the meaning of yem, bep, and jik and made a good guess
on dax. Finally, in Trial 20, she reached the criterion and discovered
the meaning of dax. She then had all of the concepts except vor 
which required 98 promptings. For the other syllables, the numbers
of promptings were 77 for kun, 82 for yem, 91 for bep, 107 for dax, and
123 for jik, or an average of 96 for inconsistent concepts, an
insignificant difference. This was an unusually slow learner. There
were three Ss of this type in Group 8. Against this there were 10 Ss
who reached the criterion in less than an average of 50 promptings
per concept, and none of them acquired a single consistent concept.
Hence, the negative difference in favor of inconsistent concepts in
Group 8 is due to having a small group of logical plodders on one
hand and a much larger group of quick but illogical learners on the
other.


TABLE IX


MEAN (M) NUMBER OF PROMPTINGS (P) TO LEARN AND RELEARN CONSISTENT
AND INCONSISTENT CONCEPTS IN RELATION TO COMPLEXITY OF STIMULI


Degree of               No. of                      M.P. to            M.P. to
Complexity    Group     Concepts    Method          Learn     S.D.     Relearn    S.D.
1             4 ab      98          Consistent      29.19     12.05    0.91       1.61
              4 ab      16          Inconsistent    40.13     17.10    2.06       2.22
                                    Difference      10.94     4.44     1.15       0.71

2             10        70          Consistent      34.25     20.90    1.41       2.03
              10        50          Inconsistent    47.98     17.40    4.20       3.03
                                    Difference      13.63     3.51     2.79       0.49

3             8         31          Consistent      64.75     30.88    1.68       2.35
              8         119         Inconsistent    45.07     19.16    3.72       3.49
                                    Difference      -19.68    5.82     2.04       0.51

4             9         19          Consistent      46.20     20.15    7.85       6.15
              9         137         Inconsistent    55.65     20.50    5.39       5.38
                                    Difference      9.45      2.52     -2.45      1.66
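
The figures quoted for Subject MB can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are assumptions, and the counts are those given in the text.

```python
# Promptings required by Subject MB for the five syllables whose
# meaning she discovered, as reported in the text.
promptings = {"kun": 77, "yem": 82, "bep": 91, "dax": 107, "jik": 123}

# Promptings for vor, the one syllable whose concept she did not find.
vor_promptings = 98

mean_discovered = sum(promptings.values()) / len(promptings)
gap = vor_promptings - mean_discovered

print(f"mean for the five discovered syllables: {mean_discovered}")
print(f"vor minus that mean: {gap}")
```

The gap of two promptings is what the text describes as an insignificant difference.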

3. The relation of complexity of stimuli to the rate of forgetting
concepts. 
As previously stated, the Ss of each group were divided into two
subgroups a and b, which relearned after one and three weeks respectively.
The concepts formed by each of the Ss were divided into
two classes, consistent and inconsistent, according to approved
criteria. This enabled us to discover the rate of forgetting for each
class. We have already shown that inconsistent concepts are a little
more rapidly forgotten than consistent ones. The question now is
to what extent the amount and the rate of retention is influenced by 
the complexity of the stimuli. The results given in Table X and 
Fig. 6 enable us to answer this question. Since we wish to know the 
rate in relation to the complexity of the stimuli regardless of the 
character of the concept as well as with respect to it, we first computed
the average retention of both a and b groups. This would give
an average retention after two weeks.


TABLE X


THE RETENTION OF CONCEPTS IN RELATION TO COMPLEXITY OF STIMULI


                                       Average Percent of Retention
Degree of     After 2 Weeks                               After 1 Week                  After 3 Weeks
Complexity    All Concepts   Consistent   Inconsistent    Consistent   Inconsistent     Consistent   Inconsistent
1             96.58          95.30        92.60           97.60        95.40            95.60        91.20
2             95.57          95.90        91.25           99.79        92.24            88.78        95.71
3             92.03          97.41        91.15           98.01        94.80            96.35        88.30
4             89.19          83.03        90.31           96.39        92.96            87.89        86.71

Fig. 6 and column 2 of Table X show a slight inverse relation 
between retention and degree of complexity. Retention decreases as 
complexity increases, a relation that is in harmony with the interfer- 
ence theory of forgetting. From columns 3 and 4, it appears that 
the retention of consistent concepts is more affected by complexity 
of stimuli than is that of inconsistent concepts. The last four
columns show that with one exception, the amount of retention for 
equal intervals is higher for consistent than for inconsistent concepts, 
but that with respect to the intervals tested there appears to be little 
difference in the rate of forgetting. 
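
Two of the statements above lend themselves to a direct check against Table X: that overall retention falls as complexity rises, and that consistent concepts are better retained than inconsistent ones with one exception. The sketch below is illustrative only; the column keys are assumed names, and the assignment of printed values to columns follows the reading of the table given in the text (column 2 as the two-week average, the last four columns as the one-week and three-week figures).

```python
# Average percent of retention copied from Table X.
# Keys are degrees of complexity; the column names are illustrative.
retention = {
    1: {"all_2wk": 96.58, "con_1wk": 97.60, "inc_1wk": 95.40,
        "con_3wk": 95.60, "inc_3wk": 91.20},
    2: {"all_2wk": 95.57, "con_1wk": 99.79, "inc_1wk": 92.24,
        "con_3wk": 88.78, "inc_3wk": 95.71},
    3: {"all_2wk": 92.03, "con_1wk": 98.01, "inc_1wk": 94.80,
        "con_3wk": 96.35, "inc_3wk": 88.30},
    4: {"all_2wk": 89.19, "con_1wk": 96.39, "inc_1wk": 92.96,
        "con_3wk": 87.89, "inc_3wk": 86.71},
}

# Inverse relation: overall retention should fall at each step in complexity.
overall = [retention[degree]["all_2wk"] for degree in sorted(retention)]
is_decreasing = all(a > b for a, b in zip(overall, overall[1:]))

# Cases in which consistent concepts are NOT better retained.
exceptions = [
    (degree, interval)
    for degree in sorted(retention)
    for interval in ("1wk", "3wk")
    if retention[degree][f"con_{interval}"] <= retention[degree][f"inc_{interval}"]
]

print("overall retention decreases with complexity:", is_decreasing)
print("exceptions (degree, interval):", exceptions)
```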

4. The relation of complexity of stimuli to the percent of consistent
concepts acquired. 

To discover how complexity of stimuli affects the percent of 
consistent concepts formed, we computed the number and percent of 
such concepts that were formed for each degree of complexity. We 
also computed the effect of the complexity on the amount of error. 
For a rough measure of the amount of error, we simply counted the 
total number of wrong concepts given in each group and divided it
by the number of Ss, thus getting the average amount of error per 
subject. Table XI and Fig. 7 set forth the results. 
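
The percent of consistent concepts for each degree can be recovered from the counts of consistent and inconsistent concepts reported in Table IX. The sketch below is a check under that assumption rather than the author's own procedure; the variable names are illustrative.

```python
# Numbers of consistent and inconsistent concepts for each degree of
# complexity, taken from the "No. of Concepts" column of Table IX.
consistent = {1: 98, 2: 70, 3: 31, 4: 19}
inconsistent = {1: 16, 2: 50, 3: 119, 4: 137}

percent_consistent = {
    degree: round(
        100 * consistent[degree] / (consistent[degree] + inconsistent[degree])
    )
    for degree in consistent
}

for degree, percent in percent_consistent.items():
    total = consistent[degree] + inconsistent[degree]
    print(f"Degree {degree}: {consistent[degree]} of {total} concepts = {percent} percent")
```

The totals (114, 120, 150, and 156) agree with six concepts for each of the 19, 20, 25, and 26 Ss in the four groups.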

Fig. 7 shows that the curve for the percent of consistent or correct 
concepts falls very rapidly in relation to increase in complexity of 
stimuli. Likewise, the curve for the average amount of error per 
S rises very rapidly in relation to this factor. These are the most 
striking effects of the complexity of the stimuli. This is not unreason- 
able when we consider that in Degree 1, the S must select from 168 
words 7 words which fall into a single group. In Degree 2, he must 
select such a group from 336 words. In Degree 3, he must select the 
group from the same number of words and further discriminate it 
from three other logical groups which have varying amounts of 
evidence in their favor. And in Degree 4, he must select the group 
from a similar situation in which there are 504 words. 
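
The word totals quoted above follow from the number of cards in a series. The sketch below is an inference from figures given in the text (42 cards, six syllables, and the stated totals), not a statement of how the cards were actually printed; the names are illustrative.

```python
CARDS = 42      # cards in a series, as stated in the Summary
SYLLABLES = 6   # Kun, Vor, Yem, Bep, Dax, Jik (Table VII)

# Total words from which the S must select, by degree of complexity,
# as stated in the text (Degree 3 uses the same number as Degree 2).
total_words = {1: 168, 2: 336, 3: 336, 4: 504}

words_per_card = {degree: n // CARDS for degree, n in total_words.items()}
cards_per_syllable = CARDS // SYLLABLES

# On each card one word belongs to the concept; the rest vary.
varying_words = {degree: n - 1 for degree, n in words_per_card.items()}

print("words per card:", words_per_card)
print("cards (supporting words) per syllable:", cards_per_syllable)
print("varying words per card:", varying_words)
```

The varying words per card (3, 7, 7, and 11) correspond to the numbers of variables named for the four degrees in the description of the problems.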


TABLE XI


THE RELATION OF COMPLEXITY OF STIMULI TO PERCENT OF CONSISTENT
OR CORRECT CONCEPTS FORMED


                                    Percent       Average No. of
Degree of                           Consistent    Inconsistent
Complexity            Group         Concepts      Concepts Per S
1                     4 ab          86            17.17
2                     10            58            40.03
3                     8             21            69.76
4                     9             12            86.46














5. The relation of complexity of stimuli to the kind of erroneous
concepts formed and their distribution. 

We have just observed the effect of complexity on the number of 
errors. Does it also affect the kind of errors and their distribution? 
To solve this problem, we counted and classified the wrong concepts 
reported at the end of the last trial for each S in Group 8 or Degree 3. 
The classification table used was similar to that shown in Table VI 
in a previous article (1). We found it necessary to add some new 
categories to provide for the errors related to the confusing concepts. 
These were of two kinds: (1) those that related to a single confusing 
concept, and (2) those that related to two or more confusing concepts. 
As previously stated, there were three confusing concepts for each 
syllable supported by five, four, and three instances respectively. 
Some Ss concluded that a syllable stood for only one of them, for 
example, kun stands for relatives. Others reported what might be 
called a double or multiple concept, for example, kun stands for 
animals and relatives; or vor stands for furniture and weather; or yem 
stands for metals, buildings, and grains. After a distribution table 
was constructed showing the comparative percentages of each class
of error for Degrees 1 and 3, we noticed significant differences in only 
three classes of error: primacy or first-word position errors, single 
confusing concepts, and double concepts. In Degree 3, primacy 
accounted for 43 percent of 770 errors reported; in Degree 1, the 
corresponding percentage was 29. In Degree 3, 15 percent of the 
errors related to confusing concepts while in Degree 1, only 3.6 percent 
of the errors could be so classified. In Degree 1, there were no intentional
confusing concepts introduced by the E, but some of the Ss
were able to find certain accidental or vague groupings. These facts
show that as the complexity of the stimuli is increased there is a
change in the S’s method of learning from logical to illogical procedures.
There is an increasing tendency to base concepts on such
factors as the primacy and sensory similarity of contiguous stimuli,
and also on their frequency, which we shall discuss in the next
paragraph.


Fig. 7. Relation of complexity of stimuli to percent of correct concepts
and average number of errors per S


Fig. 8. Relation between frequency of supporting instances for a given
concept and percentage of occurrence of that concept

Besides increasing complexity, an additional purpose for introducing
the confusing concepts was to discover whether the varied
frequency of instances of support for them would produce a corresponding
frequency of occurrences in the Ss’ reports. We therefore
calculated the percentage of occurrence of error for each degree of
frequency of instances or supporting words. Table XII and Fig. 8
set forth the results.

We see a perfect correlation or almost a straight-line relation
between the frequency of the stimulus for a given concept and the
percentage of the occurrence of that concept in the observer. Besides
primacy and recency of stimulus as important causes of error, we
have here evidence of another cause of error, frequency of evidence
in favor of a competing concept.


TABLE XII


COMPARISON OF FREQUENCY OF SUPPORTING INSTANCES FOR GIVEN CONCEPTS
AND THEIR CORRESPONDING PERCENTAGE OF OCCURRENCE


Frequency of Supporting
Instances                         Percentage of Occurrence
[illegible]                       8.70
4                                 5.85
[illegible]                       1.04
[illegible]                        .78


SUMMARY 


The problems of this investigation were to find the relation of the 
complexity of the stimuli to (1) the effort required to learn and re- 
learn concepts, (2) the difference in effort required to learn and re- 
learn consistent and inconsistent concepts, (3) the rate of forgetting 
concepts, (4) the percent of consistent concepts acquired, and (5) 
the kinds of erroneous concepts formed and their distribution. 
Forty-two cards having a nonsense syllable on the back and English 
words on the front, one of which belonged to a category represented 
by the syllable, were presented to the S, whose task was to learn the 
names of the cards and discover the categories for which the syllables 
stood. Four degrees of complexity were used which were effected 
by (1) varying the number of English words on the cards, and (2) by 
introducing confusing concepts having different amounts of support. 
The principal conclusions reached are the following: 

1. The amount of effort required to form concepts varies directly 
with the complexity of the stimuli from which they are derived. This 
is true for both learning and relearning, but the rate of increase in 
effort from one degree of complexity to the next is less in relearning 
than in learning. 

2. Consistent concepts as a rule require less effort to learn than 
inconsistent ones, but there is a trend in favor of a decrease in the 
amount of the difference when the complexity of the stimuli is
increased. 

3. There is a slight inverse relation between the amount of re- 
tention of concepts and the degree of complexity of the stimuli from 
which they are derived. 

4. The percent of correct or consistent concepts formed decreases 
very rapidly as the complexity of the stimuli is increased, and this is 
accompanied by a comparable rate of increase in the number of incorrect
or inconsistent concepts. These are the most striking effects
of the complexity of stimuli, and they indicate that if the formation 
of the correct concept is the goal of the learning process, careful 
attention must be given to complexity or simplicity of the material. 

5. Complexity of stimuli has important effects on the kind and 
distribution of errors. Introducing confusing or conflicting concepts 
leads to the formation of concepts with double and multiple meanings. 
As the complexity of the stimuli is increased there is a definite trend 
to shift from logical to illogical learning, or to base concepts on such 
factors as the primacy, frequency, and sensory similarity of contigu- 
ous stimuli. A direct relationship was found between the frequency 
of the occurrence of concepts and the number of instances supporting 
them. 


(Manuscript received July 31, 1945) 
RETROACTIVE INHIBITION AS A FUNCTION OF THE
RELATIVE SERIAL POSITIONS OF THE ORIGINAL 
AND INTERPOLATED ITEMS 


BY ARTHUR L. IRION 


State University of Iowa¹


A transfer theory of retroactive inhibition² carries with it the
implication that the degree of similarity between the tasks for the 
original and interpolated learning is one of the major conditions of 
retroactive inhibition. (Transfer of training, without regard to sign, 
is considered to be a function of the similarity between the two tasks.) 
Similarity of tasks is not, however, a unitary thing. It is possible that
two tasks may be highly similar with respect to one characteristic and, 
at the same time, highly dissimilar with respect to another. In rote 
serial learning there is a condition of similarity beyond the types of 
formal and meaningful similarity usually considered. This is the 
similarity induced by the identity of the serial positions of the corre- 
sponding items in the two lists to be learned. This is not an inde- 
pendent condition of similarity for the items must be related in some 
other manner, as for example, by meaning, in order for a condition of 
similarity of serial position to exist. 

McGeoch and McGeoch (1) found that “similarity or identity of 
serial position is not an essential condition for the inhibitory operation 
of synonyms, nor is the relation a regular function of the positional 
disparity.” The present study is designed to repeat a part of this
experiment and to extend the problem to include the effects of identity 
and non-identity of serial position when identical words are used in 
the original and interpolated lists. 


METHOD 


In this study the materials to be learned were ten-item lists of two-syllable adjectives. The
words were taken from Melton (2), and were so chosen that no two words in a particular list began
with the same syllable or letter or began or ended with the same sound. An attempt was made
to keep the words in each list from forming a sequence that was obviously meaningful. In the
case of the paired lists, where the second list was made up of the synonyms of the words in the
first list, the average association value of the word pairs was 2.35 with a standard deviation of
approximately 0.25.³ The lists were presented to the subject on a Missouri type memory drum
at the rate of two sec. per word. Eight sec. elapsed between the time the last word in the list
disappeared and the cue symbol (before the first word in the list) appeared. The drum was
mounted in a screen, the E being concealed from the view of the S. The position of the S was
such that he would receive as few distracting stimuli as possible from other objects in the room.
Throughout the experiment learning was by the anticipation method.

¹ From the Laboratories of Psychology at the State University of Iowa. The writer is indebted
to Dr. John A. McGeoch who directed the research and to Dr. Kenneth W. Spence for
his assistance in preparing the manuscript.

² By retroactive inhibition is meant the decrement in the recall or relearning of a learned act
attributable to the learning of a second act between the time of the original learning and its
recall.


EXPERIMENTAL CONDITIONS 


The conditions of this experiment are indicated in Table I. 


TABLE I

Condition    Type of Treatment*

A    OL 5 trials    30 sec. instructions     9 min. rest      30 sec. instruction    RL to
                    for interpolated rest    (Type I)         for RL                 criterion

B    OL 5 trials    30 sec. instruction      IL 10 trials     30 sec. instruction    RL to
                    for IL                   (Type II)        for RL                 criterion

C    OL 5 trials    30 sec. instruction      IL 10 trials     30 sec. instruction    RL to
                    for IL                   (Type III)       for RL                 criterion

D    OL 5 trials    30 sec. instruction      IL 10 trials     30 sec. instruction    RL to
                    for IL                   (Type IV)        for RL                 criterion

E    OL 5 trials    30 sec. instruction      IL 10 trials     30 sec. instruction    RL to
                    for IL                   (Type V)         for RL                 criterion


*In this diagram the abbreviation OL is used for original learning, IL is used for inter- 
polated learning, and RL is used for the relearning of the originally learned list. 


The five trials of original learning were standard for all conditions. The relearning trials 
began exactly 10 min. after the completion of the original learning under all conditions. Re- 
learning consisted of further practice on the originally learned list and was continued until the 
S reached the criterion of two successive perfect trials. The differences between the experimental 
conditions are to be found in the types of interpolated activity engaged in between the time of 
original learning and the time of relearning. The types of interpolated activity are considered 
below. 

Interpolated activity, type I (condition A). The S engaged in a rest activity for the nine- 
min. period available. In order that rehearsal might be prevented, the S was occupied with 
reading jokes for the entire period. The S was led to believe that this was a part of the experi- 
ment. He was instructed to read as many of the jokes as he could during the allotted time and 
to select the three jokes which he thought were the best of those which he had read. His choices 
were recorded by the E in his presence.

Interpolated activity, type II (condition B). The S learned an interpolated list of adjectives 
for 10 trials. The adjectives in the interpolated list were synonyms of the adjectives in the 
original list and they were arranged in the same serial order as the corresponding words in the 
original list. The remainder of the time in the interpolated period was filled with joke-reading
as in condition A, except that the S was only required to select one joke instead of three. 
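
The timing of the session can be laid out from the figures given above. The sketch below is a rough estimate only: it assumes that one trial on the drum lasts ten words at two sec. each plus the eight-sec. interval, and it ignores the exposure of the cue symbol. The constant names are illustrative, and the remainder it computes is an inference, not a figure reported in the article.

```python
INSTRUCTION_SEC = 30          # instructions before and after the interpolated period
INTERPOLATED_SEC = 9 * 60     # nine-minute interpolated period

# Interval between the end of original learning and the start of relearning.
gap_sec = INSTRUCTION_SEC + INTERPOLATED_SEC + INSTRUCTION_SEC

ITEMS_PER_LIST = 10           # ten-item lists
SEC_PER_WORD = 2              # drum rate
INTERTRIAL_SEC = 8            # interval before the cue symbol reappears
IL_TRIALS = 10                # trials of interpolated learning

# Assumed duration of one trial, and of the ten interpolated trials.
trial_sec = ITEMS_PER_LIST * SEC_PER_WORD + INTERTRIAL_SEC
interpolated_learning_sec = IL_TRIALS * trial_sec

# Time left over for joke-reading in conditions with an interpolated list.
joke_reading_sec = INTERPOLATED_SEC - interpolated_learning_sec

print(f"gap between OL and RL: {gap_sec / 60:.0f} min.")
print(f"estimated interpolated learning: {interpolated_learning_sec} sec.")
print(f"estimated joke-reading remainder: {joke_reading_sec} sec.")
```

Under these assumptions the gap comes to the ten minutes stated for all conditions, and somewhat more than four minutes of the interpolated period would remain for joke-reading.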

³ The pairs of adjectives were scaled with respect to associative value (meaningful similarity)
on a scale from 0.00 (no similarity) to 3.00 (complete similarity or identity). The mean associative
value of 2.35 represents a high degree of similarity between the words in the first list and the
words in the second list.

Interpolated activity, type III (condition C). The S learned an interpolated list of adjectives
for 10 trials. The adjectives in the interpolated list were synonyms of the adjectives in the
original list, but they were arranged in a different serial order from the corresponding words in
the original list. If A B C D E F G H I J represents the order of the adjectives in the original
list, the order of the corresponding synonyms in the interpolated list would be represented by
D A H E B I F C J G. The remainder of the interpolated period was filled with joke-reading
as in condition B.
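
The rearrangement used in condition C can be examined by comparing the serial position of each item in the two orders. The following sketch uses the letter notation of the text; the variable names are illustrative.

```python
# Serial order of the original list and of the corresponding items in
# the interpolated list, using the letter notation of the text.
original = list("ABCDEFGHIJ")
interpolated = list("DAHEBIFCJG")

# Shift in serial position for each item (positive = later in the
# interpolated list than in the original list).
shifts = {
    item: interpolated.index(item) - original.index(item)
    for item in original
}

# Items that keep the same serial position in both lists.
same_position = [item for item, shift in shifts.items() if shift == 0]

for item, shift in shifts.items():
    print(f"{item}: position {original.index(item) + 1} -> "
          f"{interpolated.index(item) + 1} (shift {shift:+d})")
print("items in identical positions:", same_position)
```

No item occupies the same serial position in both orders, and the shifts range from one to five positions in either direction.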

Interpolated activity, type IV (condition D). The S learned an interpolated list of adjectives
for 10 trials. The adjectives in the interpolated list were the same adjectives that appeared
in the original list and they were presented in the same serial order. In other words, interpolated 
learning consisted of continued practice on the original list. The S was not aware that he was 
going to continue practice on the first list until the interpolated list was presented to him. The 
remainder of the interpolated period was filled with joke-reading as in conditions B and C. 

Interpolated activity, type V (condition E). The S learned an interpolated list of adjectives 
for 10 trials. The adjectives in the interpolated list were the same adjectives that appeared in 
the original list, but they were arranged in a different serial order. The relative serial order of 
the words in the two lists was the same as under condition C. The remainder of the interpolated 
period was filled with joke-reading as under conditions B, C, and D. 

In all of the conditions involving the learning of an interpolated list, the S was given two recall
trials of the interpolated list 30 sec. after the completion of relearning. This was done in an 
effort to keep the S’s motivation for the learning of the interpolated list approximately equal to 
his motivation for the learning of the original list. 

Twenty-five Ss served in this experiment. Each S served under all of the experimental 
conditions. These conditions were counterbalanced as regards practice effects and the effects
of list upon condition. An attempt was made to balance out the effects of one condition on 
another, insofar as this was possible with 25 Ss, by making each condition precede and follow 
each other condition approximately an equal number of times. 
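
The article does not say how the orders of conditions were constructed. Purely as an illustration of what balancing "each condition preceding and following each other condition" involves, the sketch below builds one standard scheme for five conditions (a Williams design, which for an odd number of conditions needs ten orders). The function name and the scheme itself are assumptions for the sake of the example, not the writer's procedure.

```python
from collections import Counter

CONDITIONS = ["A", "B", "C", "D", "E"]


def williams_orders(n):
    """Return 2n orders (for odd n) in which every ordered pair of
    distinct conditions occurs in immediate succession equally often."""
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Remaining rows are cyclic shifts; the mirror images complete the set.
    rows = [[(x + shift) % n for x in first] for shift in range(n)]
    return rows + [list(reversed(row)) for row in rows]


orders = [[CONDITIONS[i] for i in row] for row in williams_orders(len(CONDITIONS))]

# Count how often each condition immediately precedes each other condition.
follows = Counter(
    (earlier, later)
    for order in orders
    for earlier, later in zip(order, order[1:])
)

for order in orders:
    print(" ".join(order))
print("distinct ordered pairs:", len(follows))
print("times each pair occurs:", set(follows.values()))
```

Since 25 Ss cannot be divided evenly among ten such orders, a balance of this kind could only be approximate, which is consistent with the wording of the text.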

Before entering upon the experiment proper each S was given two practice sessions under 
conditions which were later duplicated (insofar as procedure is concerned) in the experimental 
sessions. The first practice period used a rest or single-list condition while the second practice
period used a work or double-list condition. 

Of the 25 Ss who served in this experiment 13 were men and 12 were women. Most of 
them were sophomore students of general psychology who had volunteered for the experiment.


RESULTS 


Original learning.—In order to determine the effects of the various 
interpolations upon the recall and relearning of the originally learned 
material, it is important that no differences exist in the amounts of 
original learning as between the various conditions. Degree of 
original learning for the various conditions, as considered from the 


TABLE II 


DEGREE OF ORIGINAL LEARNING UNDER THE VARIOUS CONDITIONS 


             Mean Total Number        Mean Number Correctly
Condition    Correctly Anticipated    Anticipated on the
             in 5 Trials              Fifth Trial
A            20.56                    6.56
B            22.08                    6.92
C            22.16                    6.56
D            22.40                    6.96
E            21.48                    6.68


standpoint of the mean total number of correct anticipations during 
the five original learning trials and from the standpoint of the mean 
number of correct anticipations on the fifth trial of original learning,
is presented in Table II. Although none of the differences between 
these means is statistically significant, the problem of differences in 
degree of original learning is not one of significance of difference in a 
statistical sense, but rather whether the obtained differences in original
learning means could reasonably be a condition of subsequent
differences in recall and relearning of the original list. Since the 
differences in original learning as shown in Table II are very small, 
and since they do not appear to be related to the obtained differences 
in the recall and relearning of the original list, it is the opinion of the 
writer that these differences are not important for the findings of 
this experiment.

Recall and relearning.—Four types of measures were taken during 
the relearning trials. These are designated, for the sake of con- 
venience, as follows: 


RC1  The number of correct anticipations on the first relearning
     trial.

RC2  The number of correct anticipations on the second relearning
     trial.

RL1  The number of relearning trials up to, but not including,
     the first errorless trial.

RL2  The number of relearning trials up to, but not including,
     the criterial trials. (In this experiment the criterion was
     two successive errorless trials.)


The scores obtained for each of these measures under the various 
conditions are shown in Table III. 


TABLE III


MEAN RECALL AND RELEARNING SCORES UNDER THE VARIOUS CONDITIONS


Condition    RC1     RC2     RL1     RL2
A            5.68    7.92    3.68    5.00
B            2.16    5.24    5.76    6.44
C            1.24    4.16    6.12    7.60
D            8.80    9.64    0.76    1.00
E            2.04    4.44    6.44    8.48


The means of these four scores as between the five conditions were 
compared by means of the t statistic for related measures. These
comparisons are presented in Table IV. 
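
Since each S served under all conditions, the t statistic for related measures is presumably the usual one computed on the differences between each S's two scores. The sketch below shows that computation on invented scores; the data are hypothetical and are not taken from the experiment, and the function name `paired_t` is an assumption.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """t for related measures: the mean of the paired differences
    divided by the standard error of that mean (df = n - 1)."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error


# Hypothetical scores of five Ss under two conditions
# (illustrative values only, not data from the study).
scores_first = [6, 5, 7, 4, 6]
scores_second = [2, 3, 2, 1, 3]

t = paired_t(scores_first, scores_second)
print(f"t = {t:.2f} with {len(scores_first) - 1} degrees of freedom")
```

With 25 Ss, as in the experiment, each comparison would rest on 24 degrees of freedom.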

In Table IV a positive value of a difference indicates that the first 
condition in the pair had a higher mean score than the second. For 
example, in the difference between conditions A and B with respect 
to RC1, the mean for condition A was 5.68 and the mean for condition
B was 2.16. The difference between these means is 3.52, and since 
condition A is the left hand, or first, condition, this difference is 
given a positive value. 
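
The sign convention can be illustrated by recomputing the differences for one measure from the condition means in Table III. This is a sketch only; the names are illustrative, and differences recomputed from rounded means need not agree with the printed table in the last digit.

```python
from itertools import combinations

# Mean number of correct anticipations on the first relearning trial
# for each condition, copied from Table III.
rc1_means = {"A": 5.68, "B": 2.16, "C": 1.24, "D": 8.80, "E": 2.04}

# First condition minus second condition for every pair, so that a
# positive value means the first condition had the higher mean.
differences = {
    f"{first}-{second}": round(rc1_means[first] - rc1_means[second], 2)
    for first, second in combinations(rc1_means, 2)
}

for pair, difference in differences.items():
    print(f"{pair}: {difference:+.2f}")
```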

TABLE IV


THE DIFFERENCES BETWEEN THE MEANS OF RC1, RC2, RL1, AND RL2, AND THE
t STATISTICS FOR THESE DIFFERENCES


          RC1                RC2                RL1                RL2
Cond.     Diff.    t         Diff.    t         Diff.    t         Diff.    t
A-B       +3.52    15.54*    +2.48    5.09*     -2.08    3.36*     -1.44    1.89
A-C       +4.44    9.61*     +3.56    7.26*     -2.44    2.79*     -2.60    2.00
A-D       -3.12    5.96*     -1.92    6.30*     +2.92    7.63*     +4.00    4.64*
A-E       +3.64    7.24*     +3.28    7.13*     -2.76    3.49*     -3.48    3.21*
B-C       +0.92    1.89      +1.08    2.11      -0.36    0.55      -1.16    0.90
B-D       -6.64    12.20*    -4.40    16.63*    +5.00    11.47*    +5.44    10.16*
B-E       +0.08    0.26      +0.80    1.79      -0.68    1.05      -2.04    2.22
C-D       -7.56    22.78*    -5.48    13.69*    +5.36    7.88*     +6.60    5.62*
C-E       -0.80    1.59      -0.28    1.44      -0.32    0.40      -0.88    0.48
D-E       +6.76    15.89*    +5.20    13.58*    -5.68    8.57*     -7.48    8.23*

* Represents significance at the one percent level of confidence.


Interpolated learning.—Retroactive inhibition is known to be a
function of the degree of learning of the original and interpolated
tasks. In this experiment there were differences in degree of inter- 
polated learning which are, no doubt, attributable to differential 
degrees of positive and negative transfer from the original to the 
interpolated learning. It is possible, however, that this differential 
degree of interpolated learning may be a determining condition of the 
differences in recall and relearning which were obtained between the 
various conditions. That is to say that some of the obtained results
may be a function of the differential degree of interpolated learning 
in and of itself, regardless of the fact that the conditions determining 
the differential degree of interpolated learning also determine the 
amount of positive and negative transfer from the learning of the 
interpolated list to the relearning of the original list. 

Two measures were taken of the degree of interpolated learning. 
These may be abbreviated as follows: 


IL1  The total number of correct anticipations during the 10
     trials of interpolated learning.

IL2  The number of correct anticipations on the tenth trial of
     interpolated learning.


The means for these scores under the various conditions are given
in Table V.


TABLE V


DEGREE OF INTERPOLATED LEARNING. IL1 AND IL2 MEANS
FOR THE VARIOUS CONDITIONS


Condition    IL1      IL2
B            54.84    8.88
C            53.68    8.44
D            80.24    10.00
E            49.04    8.12


The means for the two measures of interpolated learning
as between the various conditions were compared by means of the t
statistic. The differences between the means and the corresponding
t’s are given in Table VI. There are several pairs of means that
differ significantly with respect to degree of interpolated learning. 
The problem, however, is not concerned with the determination of 
the probability of the recurrence of these differences under duplicated 
conditions, but rather with the obtained differences in the degree of 
interpolated learning and with the effect of these differences upon the 
subsequent recall and relearning of the originally learned material. 
The choice is between two alternative hypotheses regarding the 
operation of the inhibitory effects which are evidenced during recall 
and relearning: whether the differences in the recall and relearning 
means are due to differences in the degree of learning of the interpolated
lists (which differences, of course, were produced by the
differential degrees of proactive inhibition and facilitation operating
from the original learning), or whether the same conditions which
operated to produce differences in the interpolated learning also
operated to produce differences in the recall and relearning of the
originally learned material. The writer favors the latter hypothesis
as being the most likely.


TABLE VI

MEAN DIFFERENCES AND t'S BETWEEN THE VARIOUS CONDITIONS FOR TWO MEASURES
OF THE DEGREE OF INTERPOLATED LEARNING

                    IL1                   IL2
Condition      Diff.       t         Diff.      t
B-C            + 1.16      0.55      +0.44      1.35
B-D            -25.40      16.60*    -1.12      4.34*
B-E            + 5.80      3.02*     +0.76      1.95
C-D            -26.56      12.38*    -1.56      5.13*
C-E            + 4.64      1.61      +0.32      0.86
D-E            +31.22      12.63*    +1.88      4.70*

* Represents significance at the one percent level of confidence.
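
The mean differences reported in Table VI follow from the condition means in Table V by simple subtraction. A minimal Python sketch of that check is given below. The names `TABLE_V_MEANS` and `pairwise_differences` are illustrative and not part of the original analysis, and the means 53.68 and 8.44 are assigned to condition C on the assumption that they belong to the row between B and D.

```python
from itertools import combinations

# Condition means transcribed from Table V: (IL1, IL2).
# Assumption: the means 53.68 and 8.44 belong to condition C.
TABLE_V_MEANS = {
    "B": (54.84, 8.88),
    "C": (53.68, 8.44),
    "D": (80.24, 10.00),
    "E": (49.04, 8.12),
}


def pairwise_differences(means):
    """Return {"X-Y": (IL1 difference, IL2 difference)} for every pair of conditions."""
    diffs = {}
    for first, second in combinations(sorted(means), 2):
        diffs[f"{first}-{second}"] = tuple(
            round(a - b, 2) for a, b in zip(means[first], means[second])
        )
    return diffs


if __name__ == "__main__":
    for pair, (il1, il2) in pairwise_differences(TABLE_V_MEANS).items():
        print(f"{pair}: IL1 {il1:+.2f}  IL2 {il2:+.2f}")
```

Computed this way, five of the six IL1 differences and all six IL2 differences agree with Table VI; the D-E difference for IL1 comes out as 31.20 against the tabled 31.22. The t values cannot be reproduced from the means alone, since they also depend on the variability of the individual scores.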

Overt intrusions.—Since the number of overt intrusions is one
index of the amount of interference between the original list and the
interpolated list, it is worthwhile to compute the number of overt
intrusions which occurred under each of the conditions of this experiment.
This was done for the first and second relearning trials. Unfortunately,
due to the nature of the experimental design, it was not
possible to compute the number of overt intrusions directly. Rather,
it was necessary to arrive at some index of the number of intrusions.
This was necessary because, under conditions D and E, it was not
possible to differentiate between overt intra-serial errors and overt
intrusions of an inter-serial nature. Consequently, all of the overt 
errors (intra-serial and inter-serial errors) which were made during 
the first and second relearning trials were computed on the assumption 
that the number of intra-serial errors would remain constant from one 
condition to another. The basis for this assumption is that all of the 
lists occurred an equal number of times with each condition and that 


TABLE VII 
Overt ERRORS DURING RELEARNING 
Mean Number of Overt Mean Number of Overt 
Condition Errors on First RL Trial Errors on Second RL Trial 
A 0.56 0.68 
B 0.44 1.48 
C 0.64 1.60 
D 0.32 0.16 
E 1.72 1.72 


each condition occurred an equal number of times at each stage of 
practice so that, had learning on each condition continued without 
interpolated activity, we should expect approximately an equal num- 
ber of overt intra-serial errors under each condition. If this assump- 
tion is granted, a differential number of overt errors under the various 
conditions during the first two relearning trials must be considered as 


TABLE VIII 


Tue DIFFERENCES BETWEEN THE EXPERIMENTAL CONDITIONS AND THE ConTROL CONDITION 
witH REspecT TO THE MEAN NuMBER OF OVERT ERRORS ON THE FIRST AND SECOND 
RELEARNING TRIALS AND THE t STATISTICS FOR THEM 














Overt Errors, First RL Trial Overt Errors, Second RL Trial 
Condition 
Diff. t Diff. t 
A-B +0.12 0.53 —o0.80 3.42* 
A-C —0.08 0.30 —0.92 $.25° 
A-D +0.24 1.00 +0.52 2.60** 
A-E +1.16 11.04* —1.04 $35" 

















* Represents significance at the one percent level of confidence. 
** The difference between the means of condition A and condition D is significant at the 
two percent level of confidence. 


a function of the interpolated activity. The results presented in 
Tables VII and VIII indicate that, using this measure of negative 
transfer, the only condition to differ significantly from the control 
condition (condition A) on the first relearning trial was condition E. 
On the second relearning trial, however, conditions B and C as well as 
condition E differed significantly from condition A. 
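
The index described in this section treats intra-serial errors as constant across conditions, so that any excess of overt errors over the control condition is attributed to the interpolated activity. A rough Python sketch of that subtraction, using the means of Table VII, follows. The names `TABLE_VII_MEANS` and `excess_over_control` are hypothetical, and the sign convention (experimental minus control) is the reverse of the A-minus-condition differences listed in Table VIII.

```python
# Mean overt errors transcribed from Table VII: (first RL trial, second RL trial).
TABLE_VII_MEANS = {
    "A": (0.56, 0.68),  # control condition
    "B": (0.44, 1.48),
    "C": (0.64, 1.60),
    "D": (0.32, 0.16),
    "E": (1.72, 1.72),
}


def excess_over_control(means, control="A"):
    """Subtract the control means from each experimental condition's means."""
    base = means[control]
    return {
        cond: tuple(round(value - ref, 2) for value, ref in zip(vals, base))
        for cond, vals in means.items()
        if cond != control
    }


if __name__ == "__main__":
    for cond, (first, second) in excess_over_control(TABLE_VII_MEANS).items():
        print(f"{cond}: first trial {first:+.2f}, second trial {second:+.2f}")
```

On this index condition E shows the largest excess on both trials (1.16 and 1.04 errors), while condition D falls below the control on both.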


DISCUSSION


If each of the lists of adjectives which was learned in this experiment
is considered as a series of stimulus-response relations, each unit
serving both as a stimulus for the succeeding unit and as a response
to the preceding one, it may be seen that, in those cases where the original
stimulus-response connections established during the original learning
were altered during the interpolated learning period by the substitution
of a different response unit to the same or similar stimulus unit,
there exists a decrement in the strength of the originally learned
connections⁴ as measured by the recall and relearning methods,
relative to a condition wherein such substitution of response units did
not take place (Conditions B, C, and E as contrasted with conditions
A and D). These results fit the predictions which might be made
on the basis of the Müller-Schumann paradigms or on the basis of some
analogous transfer theory of retroactive inhibition.

In this experiment retroactive inhibition becomes a function of 
the relative serial positions of the original and interpolated items 
only when these items approach identity. The findings of McGeoch 
and McGeoch (1) that “similarity or identity of serial position is not 
an essential condition for the inhibitory operation of synonyms” have 
been confirmed by the two conditions of the present experiment 
which employed synonyms of the originally learned items as inter- 
polated material. The inhibitory effect of the interpolation of the 
same units in a different serial order may be reduced to the same 
theoretical basis as the effect of changing the response items entirely, 
namely the formation of a second response to a given stimulus and 
the resulting decrement in the probability of occurrence of a response 
which had been previously learned to that stimulus. Since the 
amounts of retroactive inhibition obtained under the three conditions 
producing negative transfer did not differ significantly, it may be 
stated that the changed serial order of an originally learned list of 
adjectives serving as material for interpolated learning produces as 
much retroactive inhibition as does the interpolated learning of a 
series of adjectives which are related to the words in the originally 
learned list as synonyms, whether these interpolated synonyms be in 
the same or in a changed serial order relative to the serial order of the 
corresponding words in the originally learned list. 


CONCLUSIONS 


The findings of this experiment may be summarized as follows: 
1. In the serial learning of adjectives by the anticipation method,
a changed order of the originally learned material serving as material
for interpolated learning produces a decrement in the recall and
relearning of the originally learned material.

2. This decrement is not significantly different from decrements
which may be produced by the interpolated learning of synonyms of
the originally learned material either in the same or in a changed
serial order.

3. Insofar as the measure of overt intrusions employed in this
experiment is valid as a measure of negative transfer, it would appear
that more negative transfer occurred under condition E than under
conditions B and C, the other two conditions under which negative
transfer was obtained.


4 By decrement in the strength of connections is meant merely the relatively lower probability
that the response will occur upon the presentation of the stimulus.


(Manuscript received October 12, 1945) 
THE INFLUENCE OF BELIEF AND DISBELIEF IN ESP 
UPON INDIVIDUAL SCORING LEVELS 


BY GERTRUDE RAFFEL SCHMEIDLER AND GARDNER MURPHY 


This study reports research done at Harvard University between 
1942 and 1945 with the support of the Richard Hodgson Fund.* 
The first phases of the work, and a brief note on the psychological 
atmosphere of the various working conditions, have been published 


elsewhere (5, 6, 7). 

The hypothesis proposed in 1942 was that when tested by equally 
rigid procedures those believing in the possibility of ESP should give 
themselves to the task with greater freedom and thus make higher 
scores than those rejecting its theoretical possibility, so that in time 
significant differences in scoring level between these two groups 
should be achieved. This hypothesis was in part derived from the 
studies of Pratt and Price (3) and of Stuart (8), suggesting the im- 
portance of attitude and set in ESP scoring level. 


PROCEDURE 


Ss were asked simply whether they could accept the theoretical possibility of the existence 
of ESP; to keep the situation informal, a rigid phrasing of the question was not considered neces- 
sary. Before proceeding further they were classified as accepting, or not accepting, this possi- 
bility. The experimental procedure used was to place decks of 25 cards, each card bearing the 
symbol square, cross, circle, waves, or star—or simply to place lists of such symbols—in conceal- 
ment (in a room or closet away from the S) and to require the S to guess the symbols in order. 
In the first three series two rooms in the Harvard Psychological Laboratory in Emerson Hall 
were used, the rooms being about 30 feet apart and the doors closed. The S worked in one 
room, the E, with the stimulus material, being in the other. In the remaining three series, performed
at the Harvard Psychological Clinic, the E placed the stimulus material in a closet and
closed the door before the S arrived for the experimental session, thereafter remaining in the 
room with the S. To insure that the S could have no normal means of guessing the true order 
of the cards (or lists), and that the E could not influence the S’s calls by subliminal cues, tables 
of random numbers were used by an assistant in preparing the stimulus material, so that the E 
herself did not at the time know the order of symbols. (The assistant was not present at the 
experiment.) Such random numbers are numbers the order of which cannot be normally guessed 
according to a ‘system’; in five of the six series they were drawn from Tippett’s tables of random
numbers.¹
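
The preparation and scoring of a run can be pictured with a short simulation. The sketch below is only an illustration under stated assumptions: Python's `random` module stands in for the printed random-number tables, each symbol is taken to be equally likely at every position, and the names `make_targets` and `count_hits` are hypothetical.

```python
import random

SYMBOLS = ["square", "cross", "circle", "waves", "star"]
RUN_LENGTH = 25  # one run is 25 calls


def make_targets(rng, length=RUN_LENGTH):
    """Build a target order by drawing each symbol independently at random.

    Assumption: each draw picks one of the five symbols with equal
    probability, as a stand-in for reading digits from a random-number table.
    """
    return [rng.choice(SYMBOLS) for _ in range(length)]


def count_hits(calls, targets):
    """Count positions where the S's call matches the concealed symbol."""
    if len(calls) != len(targets):
        raise ValueError("calls and targets must have the same length")
    return sum(call == target for call, target in zip(calls, targets))


if __name__ == "__main__":
    rng = random.Random(1942)  # fixed seed so the illustration is repeatable
    targets = make_targets(rng)
    calls = [rng.choice(SYMBOLS) for _ in range(RUN_LENGTH)]  # a purely guessing S
    print(count_hits(calls, targets), "hits out of", RUN_LENGTH)
```

Since the target order is fixed before the session and unknown to both the S and the E, the hit count depends only on the correspondence between the two lists.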

When it began to be evident in the spring of 1943 that the data were consistently conforming
to the hypothesis that scoring level would be higher among those accepting the possibility of ESP,
the procedure was repeated with new Ss; we were not content with two such series, and four
additional series of Ss have been tested. No one of the six series of tests comprises less than
200 runs of 25 cards each, each such series giving results in conformity with the hypothesis.²


* Upon being appointed Richard Hodgson Fellow in 1942, G. M. outlined an area of research
and requested G. R. S. to develop a procedure to be followed under his supervision; the development
of the procedure, and the research itself, were the work of G. R. S.

1 The records of guesses made, and of symbols actually used, are available for inspection.
The hits have been twice checked, but a third checking by an outsider would be welcomed.


The tables below give the number of Ss, number of runs, critical 
ratio and P-value for comparison of the groups differentiated by 
attitude toward the existence of ESP. It is not the statistical values, 
however, but the consistency of the differences between the two 
groups, which we would especially emphasize.³


TABLE I

SCORES OBTAINED IN AN ESP CARD-GUESSING EXPERIMENT BY Ss WHO, IN THIS EXPERIMENT,
ACCEPTED THE POSSIBILITY OF PARANORMAL EXPERIENCE AND BY Ss WHO REJECTED
THE POSSIBILITY OF PARANORMAL SUCCESS IN SCORING

A. Ss who accepted the possibility of paranormal experience

                          Series 1   Series 2   Series 3   Series 4   Series 5   Series 6
No. of Ss                 12         12         22         9          23         19
No. of runs               129        127        133        162        207        171
Deviation from
  ‘chance expectation’    +56        +33        +31        +34        +45        +27
Mean score                5.43       5.26       5.23       5.21       5.22       5.16

B. Ss who rejected the possibility of paranormal experience

                          Series 1   Series 2   Series 3   Series 4   Series 5   Series 6
No. of Ss                 4          4          4          3          3          16
No. of runs               200        175        199        54         27         144
Deviation from
  ‘chance expectation’    -10        -12        -12        -41        -23        -26
Mean score                4.92       4.93       4.94       4.24       4.15       4.82

The data from the same Ss are summed in Table II. 

TABLE II

COMPARISON OF SCORES OBTAINED IN AN ESP CARD-GUESSING EXPERIMENT BY Ss WHO, IN
THIS EXPERIMENT, ACCEPTED THE POSSIBILITY OF PARANORMAL EXPERIENCE AND
BY Ss WHO REJECTED THE POSSIBILITY OF PARANORMAL SUCCESS IN SCORING

                            Ss Who Accepted the Possibility    Ss Who Rejected the Possibility
                            of Paranormal Experience           of Paranormal Experience
No. of Ss                   97                                 34
No. of runs                 929                                799
Deviation from ‘chance
  expectation’              +226                               -124
Mean                        5.24                               4.84
S.D.                        2.03                               1.97
S.D. of mean                .07                                .07
CR*                         3.71                               2.19
P                           .0001                              .01

Diff.                       .40
S.D.Diff.                   .10
CR Diff.**                  4.00
P                           .00003

* CR’s of 3.71 and 2.19 are based on deviations from chance expectation (5.00).
** CR of 4.00 is based on the empirical D/σ_D, comparing the two groups.


No “special sensitives,” or individuals with extraordinarily high
scores, were discovered. The effects, in absolute terms, are slight;
indeed, the paranormal would probably never have become a controversial
problem if the effects with ordinary normal people were
large. But they seem to be as consistent as human material can be
expected to be. Exactly what conditions must be fulfilled it is as yet
impossible to say; there are many unspecified factors to be discovered.
With their discovery, increasing control should be possible. But
that one important factor is attitude seems clear.


2 The reader may be interested to know how the number of runs was determined for each S.
When beginning the investigation it was felt, in the light of earlier work, that ‘crowding’ the Ss
who rejected the possibility of paranormal experience might tend to drive scores down, and
they were accordingly asked to do 50 runs at a session. On several occasions the assistant had
prepared an insufficient number of decks and less than 50 runs were done. The Ss in the group
accepting the possibility of the paranormal worked at a more leisurely pace—normally 10 runs,
sometimes fewer. Despite the fact that with random numbers one cannot pile up differences
between two groups by manipulation of the number of calls to be made, it was felt at the end of
the third series that an absolutely rigid rule should be instituted; during the last three series each
S, regardless of his attitude, made exactly nine runs in each experimental session. In Series IV
each S served in two sessions; in Series V and VI each S served in only one session.

3 Extensive mathematical and empirical work (1, 2, 9) has shown that the chance expectation
is 5.00 (the sigma for a run of 25 guesses being 2.00 with targets made up randomly, as are ours).
It is not, however, the theoretical expectation, but the empirical difference between these two
groups, which is to be stressed.
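
The chance figures of 5.00 and 2.00 and the critical ratios in Table II can be checked with a few lines of arithmetic. The sketch below assumes independent calls with a hit probability of 1/5 (five symbols), so that hits per run follow a binomial distribution, and it takes the run totals and deviations as the sums of the six series in Table I. The function and variable names are illustrative only.

```python
import math

P_HIT = 1 / 5       # five equally likely symbols (assumption: independent calls)
CALLS_PER_RUN = 25


def chance_mean_and_sigma(n=CALLS_PER_RUN, p=P_HIT):
    """Binomial mean and standard deviation of hits in one run."""
    return n * p, math.sqrt(n * p * (1 - p))


def critical_ratio(deviation, runs):
    """Deviation from chance divided by the theoretical sigma for that many runs."""
    _, sigma_run = chance_mean_and_sigma()
    return deviation / (sigma_run * math.sqrt(runs))


# Totals over the six series of Table I: (number of runs, deviation from chance).
ACCEPTED = (129 + 127 + 133 + 162 + 207 + 171, 56 + 33 + 31 + 34 + 45 + 27)
REJECTED = (200 + 175 + 199 + 54 + 27 + 144, -(10 + 12 + 12 + 41 + 23 + 26))

if __name__ == "__main__":
    print(chance_mean_and_sigma())
    for label, (runs, dev) in (("accepted", ACCEPTED), ("rejected", REJECTED)):
        mean = 5 + dev / runs
        print(label, runs, dev, round(mean, 2), round(critical_ratio(dev, runs), 2))
```

Under these assumptions the computation gives means of 5.24 and 4.84 and critical ratios of 3.71 and 2.19 in absolute value, the second being negative in sign because the deviation lies below chance.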

Some interest attaches to the fact that the sub-chance scores of
those rejecting the possibility of the paranormal are significant at the
one percent level, suggesting (as has much earlier work) that negative
attitudes do more than to block positive scoring; they appear to act
like other forms of negativism, leading to ‘contrary’ behavior. It is
to be noted, however, that we are dealing with general trends, not
with an exceptionless tendency manifest in every individual’s work.
Table III shows the relative frequency of scores above, at, and below,
the chance-level (five hits per run) in each of the two groups.


FIG. 1. A comparison of scores obtained from a card-guessing experiment by Ss who (in
this experiment) either accepted the possibility of paranormal experience or rejected the
possibility of paranormal experience. (Horizontal axis: Series I to VI, with separate notation
for Ss who accepted and Ss who rejected the possibility of paranormal experience.)


TABLE III

NUMBER OF Ss WHOSE TOTAL ESP SCORES WERE ABOVE CHANCE, AT CHANCE,
OR BELOW CHANCE

A. Ss who accepted the possibility of paranormal experience

Series                Above Chance    At Chance    Below Chance    Total Number
1                     8               3            1               12
2                     6               0            6               12
3                     15              0            7               22
4                     5               1            3               9
5                     13              3            7               23
6                     13              2            4               19
Total number of Ss    60              9            28              97
Percent               62%             9%           29%             100%

B. Ss who rejected the possibility of paranormal experience

Series                Above Chance    At Chance    Below Chance    Total Number
1                     2               0            2               4
2                     2               0            2               4
3                     2               0            2               4
4                     0               0            3               3
5                     0               0            3               3
6                     4               1            11              16
Total number of Ss    10              1            23              34
Percent               29%             3%           68%             100%


PERSONALITY FACTORS 


Now that this result has been confirmed, attention has been 
shifting from this repetition to the question of dynamics; specifically, 
to the individual psychology of those who can and those who cannot 
obtain the higher scores. A number of hypotheses were formulated 
in 1944 as to the kinds of personalities likely to succeed in ESP tasks, 
and as to the responses on the Rorschach test and the Thematic 
Apperception Test likely to identify such Ss; work in the improved 
formulation and testing of these hypotheses is in progress. 

During the summer of 1943, work with the Rosenzweig Picture
Frustration Test (4)⁴ disclosed a possible relationship between frustration
and shift in scoring level.⁵ In the case of Ss accepting the theoretical
possibility of the paranormal, the experience of taking the test
appears to result in a rise in ESP scoring level which is positively related
to adjustment status as shown by the test; thus a markedly well
adjusted S is likely to show a rise in ESP scoring level, a markedly maladjusted
S a decline. This effect requires confirmation with more Ss.

It is believed that these personality studies, and others to be
added, should be intensively followed for a period of several years.
It is possible that through the use of such methods one might discover
the attitudes which facilitate or inhibit the expression of paranormal
abilities, and the kinds of personalities most fruitful for
parapsychological study. Ultimately, of course, the purpose of
locating people with such endowments is to enable us to get at the
etiology and dynamics of these processes.


SUMMARY


Ss accepting the theoretical possibility of ESP and Ss rejecting
this possibility were compared in respect to levels of scoring in guessing
concealed symbols prepared by random numbers. In each of
six extensive series of such tests those accepting this theoretical
possibility scored higher than the others; the P of the overall difference
being due to ‘chance’ is .00003. Personality factors relating
to scoring success are discussed.


(Manuscript received September 14, 1945)


4 This test consists of 24 pictures, each of which shows a frustrating situation. The S is
required to write into a ‘balloon’ the words which the frustrated person would say. Each response
is scored as either extrapunitive (aggression is directed against the environment in the form of
blame, etc.), intropunitive (aggression is directed to the subject himself in the form of
self-depreciation, etc.) or impunitive (the situation is glossed over by denying the existence of the
frustration or by minimizing the responsibility of all concerned). Rosenzweig has tentative
norms, listing the most usual type of response to each picture. An S’s adjustment-status
score shows the number of times in which his answer conforms to the expected type. Thus the
S with high adjustment status is the one whose characters are angry at the usual times, apologize
when most others would, and on the appropriate occasions say, “That’s all right—no harm done.”

5 Shift in scoring level means change in ESP score, determined by subtracting the number
of ‘hits’ in the course of the three runs immediately preceding the administration of the P-F
test from the number of hits in the three runs immediately following the P-F test.
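
The definition of shift in scoring level amounts to a difference between two three-run totals. A small Python sketch of that subtraction is given here; the function and argument names are hypothetical, and the hit counts in the usage lines are invented purely for illustration.

```python
def shift_in_scoring_level(hits_before, hits_after, window=3):
    """Hits in the `window` runs right after the P-F test minus hits in the
    `window` runs right before it.

    hits_before: hits per run, in order, for runs preceding the test.
    hits_after:  hits per run, in order, for runs following the test.
    """
    if len(hits_before) < window or len(hits_after) < window:
        raise ValueError("need at least `window` runs on each side of the test")
    return sum(hits_after[:window]) - sum(hits_before[-window:])


if __name__ == "__main__":
    # Invented hit counts, not data from the study.
    before = [5, 4, 6, 5]
    after = [6, 7, 5, 4]
    print(shift_in_scoring_level(before, after))
```

A positive value indicates a rise in scoring level after the test and a negative value a decline.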